In my previous post, I talked about the fact that no software package has ever been released “glitch free.” The bigger the project, the more likely there are to be glitches. One of the factors that was pointed out to me (and has been in the news) is the complexity of the system. Healthcare.gov has to coordinate information from numerous systems, both sending it and receiving it. In fact, most of the problems seem to be related to the “data hub”, which is the main interface handling this. Its job is (approximately) to take data from the web site, query other systems to verify things, then send it on to the various exchanges or state agencies to get an answer from them about pricing, and then back to the customer. Sounds simple? It isn’t. Leaving aside the reports of … ineptitude … on the part of the contractor responsible, it’s an epicenter for glitches to happen.
Almost two decades ago, I was a system administrator at a healthcare provider. It was a fairly large one, consisting of three hospitals and numerous clinics scattered around the state. We were also one of the first to try out a fairly new idea: the electronic medical record. There was all sorts of promise in that, particularly given the nature of our institution. The idea was wonderful – in fact, you still hear a lot about it. Back then, it was just beginning, but still, the promise was there. We could create one record, accessible at any of our clinics or hospitals, enabling up-to-date information to be available and saving a bundle on shipping paper records around.
That was the promise, but the reality? Something else entirely. Leaving aside the fact that at the time “computer literacy” was not a common skill, particularly among doctors, the idea that we could link various clinical systems into a single medical record system ran into brick wall after brick wall. Consider the “core systems” I was responsible for: the dictation system, the transcription software, and the electronic signature package. My systems had to accept data feeds from the scheduling system and the pathology, emergency, radiology, and cardiology departments – all of which had their own “best of breed” systems. My systems fed data back to them, as well as into the electronic medical record. Sounds simple? It wasn’t. I learned to loathe the phrase “Yes, we do HL7 messaging!” Why? HL7 is a standard messaging protocol, but it sometimes wasn’t “native” to the particular package, or the “dialect” (the version) it “spoke” wasn’t the same one we spoke. Any combination of those could (and did) trip you up and lead to errors. Which was why we had regular meetings of all the system administrators, to try to iron out those issues and develop testing plans.
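To make the “dialect” problem concrete, here’s a toy sketch (not a real HL7 library, and not the actual code we ran) of the kind of check an interface engine has to do. HL7 v2 messages are pipe-delimited; the version the sender “speaks” lives in the MSH header segment (field MSH-12). The supported-version list and system names below are made up for illustration.

```python
# Toy illustration of an HL7 v2 "dialect" mismatch. The segment layout
# follows HL7 v2 pipe-delimited conventions; the application names and
# the supported-version set are hypothetical.

SUPPORTED_VERSIONS = {"2.2", "2.3"}  # the dialects *our* side spoke

def check_hl7_version(message: str) -> str:
    """Return the message's HL7 version, or raise if we can't handle it."""
    msh = message.split("\r")[0]        # MSH is always the first segment
    fields = msh.split("|")
    if fields[0] != "MSH" or len(fields) < 12:
        raise ValueError("not a valid MSH segment")
    version = fields[11]                # MSH-12: Version ID
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported HL7 dialect: {version}")
    return version

# A v2.3 result message from a friendly system passes...
ok = "MSH|^~\\&|PATH|LAB|DICT|HIS|199610011200||ORU^R01|0001|P|2.3"
print(check_hl7_version(ok))            # -> 2.3

# ...but a "best of breed" system speaking v2.1 trips the check.
bad = "MSH|^~\\&|RAD|XRAY|DICT|HIS|199610011200||ORU^R01|0002|P|2.1"
try:
    check_hl7_version(bad)
except ValueError as e:
    print(e)                            # -> unsupported HL7 dialect: 2.1
```

In practice the mismatches were rarely this clean – a field might simply be missing or laid out differently – but the failure mode is the same: both sides “do HL7,” and the messages still don’t line up.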
That’s why the “big savings” and “great idea” turned into a huge morass of delays and cost overruns. But here’s the other major headache we all had. It wasn’t enough to iron out the translations and get the systems to talk to each other. The systems all had to be working at the same time. If the pathology system was off-line for some reason, I would have a massive backlog of files waiting to be sent, which could cause problems in other systems, because they could be “holding up the line” for the files meant for them. The information needed to complete new files wasn’t there, so even more delays were created. An interface glitch in the medical record package could (and did) mean a huge backlog building up. A freeze in the signature system could mean a big backlog, and then stress on the network as the backlog was being cleared. That was just on my systems; the other system administrators had their own sets of headaches along those lines.
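The backlog dynamic above is easy to see in a tiny simulation. This is a minimal sketch under made-up numbers (files arrive at a steady rate, the downstream system drains slightly faster when it’s up), not a model of any real system:

```python
# Minimal sketch of the outage-backlog problem: files keep arriving
# while a downstream system is down, and clearing the queue afterward
# produces a burst of traffic. All rates and hours are invented.

from collections import deque

def simulate(hours: int, arrivals_per_hour: int,
             drain_per_hour: int, outage_hours: set) -> list:
    """Return the queue depth at the end of each hour."""
    queue = deque()
    depths = []
    for hour in range(hours):
        queue.extend(range(arrivals_per_hour))   # new files keep arriving
        if hour not in outage_hours:             # downstream up: drain it
            for _ in range(min(drain_per_hour, len(queue))):
                queue.popleft()
        depths.append(len(queue))
    return depths

# 10 files/hour arrive; we can normally send 12/hour downstream,
# but the downstream system is off-line for hours 2-4.
print(simulate(8, 10, 12, outage_hours={2, 3, 4}))
# -> [0, 0, 10, 20, 30, 28, 26, 24]
```

Notice the shape: the queue is flat at zero, climbs steadily during the outage, and then drains only as fast as the spare capacity allows – that slow drain is the “stress on the network as the backlog was being cleared.”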
I mention this because this is what is happening in the data hub, on a much larger scale. There are multiple systems which have to send information back and forth to each other through the hub. They’re all running different software and hardware, and they all have to be working at the same time, which wasn’t necessary before. So there are two potential “points of failure” right there. It’s complex, and it requires a lot of coordination. Apparently, that coordination didn’t happen, and there were also some problems with the contractor’s coding of it. Although there’s a figure of “500 million lines of code” floating around, most computer experts scoff at it. But it’s still a complex system, and anyone who expected it to work “perfectly” out of the box was just fooling themselves. No, “adding another server,” as Chuck Todd suggested, wouldn’t have fixed it.
The good news, such as it is, is that the “problem child” has been identified, and people are now working on it. It may take a while, it may even mean a “rewrite from scratch” for parts of it, but it’s being worked on. It’ll get better. The problem, as one commenter over at Little Green Footballs pointed out? When it’s fixed, nobody will notice it for a while, because it “just works.”