Thursday, October 22, 2009

3 Common Census Problems and How to Fix Them

I found some free time, so here is a post about census manipulation.

In the insurance industry, you deal with censuses... er, censi... let's call them enrollment listings.  You deal with enrollment listings a great deal.  Insurance carriers need to know how old everyone is, their gender, enrollment tier, zip code, and benefit elections in order to properly underwrite the case.  I myself use them for setting COBRA rates, modeling plan design changes, and other mathematical sleight of hand.

The problem is that when you work on self funded groups, they usually use an administrator that is a separate entity from the insurance carrier.  As far as the client is concerned, the administrator is only processing the claims.  Other features like response time, quality of reporting, case management notes, those are the consultant's problem and they are always a problem.

When looking at a list of proposals from administrators, every client naturally chooses the cheapest option so every report that we deal with is the worst report ever from the most bare bones administrators that we'll tolerate working with.  For example, even if I've asked for an enrollment listing 20 times I still have to specify every individual field I want and it's still wrong and no two listings ever look the same.  I would love to just pass this problem onto the carriers when we don't need the census ourselves, but 12 out of 10 times (most proposal requests go to a shared email address like, so I hear about every problem way more than a fair universe would allow) they just send the census back to us with an email much like the following:

Hi David,

I can't use this census.  Can you fix it?


-Jeffrey Out

There are usually several emails after this where I force Jeffrey to tell me what he can't use and how he needs it fixed, but you get the idea.  Just remind yourself how much you're getting paid to deal with these kinds of people and pour a shot of whiskey into your mocha; violence isn't the answer.

The fixing is what actually ties this post into Excel.  Here are the top 3 enrollment listing problems I come across:
  • Valus for some fields only appear once because the census was created in a pivot table and they need it on every line.
  • There is some problem with the date/age format.
  • There are no tiers.
The first one is a pretty easy problem to fix.  Imagine this:

The problem is the "Plan" field.  You need the value for the plan to show up on the line for every member on that plan, not just the top one.  Now imagine there are 18 different plans for 870 different members so you can't just fill the values down.  What is one to do?

Now there may be in fact a better way to do this, and if there is I would love to know it, but this is how I taught myself to solve this problem.  In the first blank field, set it equal to the cell directly above it that has the value, in this case B3.

 After doing this, copy the formula.  Then select the entire range of cells that we need to fill in the blanks for, and in the menu select Edit > Go To, then click the Special button.

From the Special window, select the option button for Blanks and hit okay.

This will select only the blank cells in the area you had previously selected.  Now, just paste in the formula and all of the cells will equal the proceeding cell, filling your Plan values down for every appropriate member.  Now simply click the letter B above to select the whole column, hit CTRL+C to copy it, and then hit ALT+E, S, V to paste the column back in as values so it won't get scrambled during sorting.  If you don't know all of the keyboard shortcuts for every paste special option, you should learn them.  It will save you a million headaches.  If I could only recommend 3 things that everyone should know in Excel, it would probably be Pivot Tables, Paste Special, and V-Lookups.

So, that is problem one solved.  The second problem can be a little more tricky at times.  Imagine, if you dare, the following:

Here we have an administrator using a Tandy or Amiga for their primary reporting system, the output of which they've twisted into a cruel approximation of a usable census.  Problems?  Let's start with the zip code:  people have 9, 8, 5, and 4 digit zip codes.  Apparently somebody imported them as text and the leading zeros were cut off.  This happens a lot, and the carriers seem to have no way of fixing it themselves.  Actually they probably decided it was cheaper to have me do it since they don't have to pay me.

Technically, there shouldn't ever be an 8 digit zip code because the dash would have caused it to be treated as text and the leading zero would have survived, but I'm trying to imagine a worst case scenario.

How do we fix this?  Well you only need the last 4 digits of a zip code if you're delivering mail (right?  I don't know...), so let's chop those off of the end.

First we'll use the SUBSTITUTE function to replace the dash with some white space.  We can trim this later.  Next we'll use the LEFT function to grab the first five characters starting from the left.  The nice thing about this is that it can handle even the string with less than 5 characters without throwing an error.

As I said, we also need to TRIM this to remove the white space for any 8 digit zip codes.  Now we can put back in those proceeding zeros.  For this we'll use the TEXT formula with the text set to "00000".  If there are 5 digits, Excel leaves them alone, but if there are 4, Excel places them as the last 4 and the first digit will be a zero, just what we're looking for.

The zip codes should now be usable, so we can just copy that column and paste it in as values so that they don't somehow get destroyed by the end user.

Now let's look at these dates:

Note that these dates have changed from previous pictures.  Suffice to say that the conditions under which I produce these blog posts are not ideal, specifically I do not own a copy of Excel and must use a remote connection to finish the ideas I start at lunch.  I know, I know, terrible...

Back to the problem at hand, not only are these dates text strings mixing letters and numbers, but the day field doesn't have a leading zero for the first 9 days of the month!  This could get painful.

Instead of creating a bunch of ugly if formulas, I'm just going to create a few new columns temporarily.  The first will grab the month using LEFT:

To get the day and year with a proceeding zero for the date, we're going to use SUBSTITUTE to pull out the month and then format it as TEXT using "000000":


Our next task is to turn that 3-letter month into something usable.  I'm going to replace it with the number of the month using a V-Lookup because our final destination, the date formula, can't use the text.  In some empty space, type the following:

Select those 4 cells and hover your pointer over the lower-right corner.  That plus symbol is what I call the pattern copier.  I call it that because that's what it does and I don't actually know the real name.  Anyway, once your cursor turns into the big + symbol,  click and drag down until you get to December.  This is your v-lookup table.

We're basically done now.  The last step is to use the DATE formula to create a usable D.O.B.  DATE accepts numbers as the arguments for year, month, and day, returning a date.  The year and day we can get simply using RIGHT and LEFT, while we'll use a vlookup for our month.  Remember to hit F4 to lock the range for your v-lookup.  I still forget to do that on an almost daily basis.

It should look like the following:

And there you have it.  As a quick bonus, my personal method of calculating a person's current age based on their date of birth is:


Just make sure you format it as a number; Excel will format it by default as a date and it will be more useless than my Google ads.

That leaves the third problem: no tiers.  This post is getting long though and I still want to watch The Unborn tonight, so that problem will have to wait for another day. It shouldn't take long though, so I may have time to put that up this weekend and even proof read this post...


No comments:

Post a Comment

Creative Commons License
David @ Work by David Montgomery is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.