Everybody’s Got One

If you woke up this morning hoping for one more person’s take on all this ‘AI’ stuff, I guess it’s your lucky day.

You won’t find a(nother) rant about how large language models (LLMs) aren’t all that ‘intelligent’, how they pose an existential risk to humanity, make people dumber, are eroding our ability to build and maintain software, are eliminating millions of jobs thereby upending the global economy, or how the whole thing is a bubble floating over a pyramid (scheme) and built entirely on theft.

All of those things seem true to me to varying degrees (especially the theft part), but that’s not what this is about. So if a rant about one of those is what you’re looking for, best keep looking.

Ok, I said no rant. I lied. For the record, I utterly abhor that a small number of sociopaths are building empires and fortunes based on what can only be described as blatant and intentional theft. It’s not the first time this has happened – see: robber barons – and it probably won’t be the last. But these people are “sorta my people” and my small part in enabling this reality fills me with no small amount of regret. Even if AI ends up being amazing (and to me the jury is most definitely still out), the original sin will remain.

If you’re early in a career in software and looking for guidance, or predictions about the course of the industry, there are lots of places to look. If you’re well into that career arc and trying to orient and navigate, there is no shortage of thought-provoking and often depressing perspectives to consider.

If you’re looking to understand how these chat bots (and the large language models they’re built on) do the seemingly magical things they do, take a few hours (!) and let Andrej Karpathy explain.

All I’m offering is a (hopefully cogent and coherent) exposition of what’s in my head.

Caveat emptor.

I’m not an Artificial Intelligence expert. Like others “of that age” I had a dream of teaching machines to think. In my teens I voraciously consumed science fiction, read futurists like Hans Moravec, and was convinced we were probably “really close” to being able to build thinking machines. For a while I considered studying cognitive neuroscience on top of computer science, but realized it was probably biting off more than I could comfortably chew.

I was interested enough to do some related coursework during my computer science undergraduate degree, in the mid-’90s. At the time the practical state of the AI art was things like expert systems, and there was a bit of an “is it/isn’t it” tug-of-war with the information retrieval folks, who focused on organizing information in ontologies and graphs. Image recognition and natural language processing were unsolved areas of active research, and you could tell the “AI people” ‘cause they wrote code in Prolog or Lisp.

I ended up being pretty disillusioned by the state of the art and didn’t give it much more time or attention for years.

I started paying attention again around ten years ago. Advances in compute (and a bit in algorithms) made things that had been “impossible” (or at least impractical) suddenly possible or “adjacent possible.” This was right around the time a bunch of non-PC gamers started to care about GPUs (Graphics Processing Units).

Since then I’ve built stuff – and helped teams build stuff – using machine learning (a term I strongly prefer over “AI”). I’ve built and trained small models, and I’ve used the current (recent?) crop of large language models enough to have a sense of their capabilities and limitations.

So I guess I’d describe myself as “not completely clueless.”

So… “AI.” Let’s start with two dirty little secrets.

Dirty-little-secret #1: In the decades I’ve been in the software industry, I’ve rarely loved the act of writing code.

I loved (and still love, mostly) solving problems for people who couldn’t solve those problems themselves. Writing code was a means to that end – not the end.

I didn’t hate writing code, but knowing I’d figured out a solution was the really rewarding bit. And (or maybe “So”) I was really never “the best coder” in a group. I was a “pretty good programmer” and I worked to develop good habits that let me collaborate with people who were better than me and only rarely feel like the idiot holding us back.

I also had some experience early in my career that forced me to realize that over time – especially as the people who wrote it disperse – code becomes more of a liability than an asset. So, in the long term, less can very much be more.

Dirty-little-secret #2: One of my few persistent “career goals” has been to put myself out of a job. To make my role unnecessary. I think of it as being “lazy in the long term” – willing to work hard on a problem today so I can stop working on or even thinking about that problem entirely “tomorrow.”

Despite the marketing hype, the current generation of LLM-based tools don’t have the potential to make “people like me” obsolete. What they do have is the potential to drastically reduce the number of people who need “people like me” to help them solve problems with computers and technology.

In a world that “software ate,” but where most people can’t self-service their software needs, these tools have tons of potential for disintermediation and empowerment. Disintermediation and empowerment seem good.

So you might think I’d be loving these tools.

I thought I would, too.

But, it turns out, I do not.

That’s not to say I hate them. I don’t. The opposite of love isn’t hate. It’s indifference.

On a personal level, I’m mostly indifferent.

On the broader implications and impact of these tools and technologies, it’s more accurate to say I’m ambivalent. And from a professional perspective, not actively embracing AI is potentially problematic. Those are both topics for another time.

I’ve come to realize that I just don’t enjoy using these tools, and I really struggle to convince myself that the value they have is worth their costs. (NB: Their actual costs. Not just the loss-leading “we’ll make it up in volume” fistfuls of dollars each month they’re priced at today.)

They’re too often opaque, capricious, and unpredictable, making it unwise to trust their results. That makes me not reach for them to answer questions, or solve problems, unless I’m already confident I know the answer.

I think about the output from an LLM as I might the writings of a hard-line political pundit. Everything has to be skeptically considered. Everything has to be fact checked. It turns out that without chasing the footnotes, you’ll never know if the reference material really says that, or even if it exists.

Maintaining the appropriate level of skepticism is real work.

I find that these tools transform work but don’t reliably reduce or eliminate it.

They turn writing English into reading and re-writing English.

They turn writing code into reading, reasoning about, and fixing code.

They turn fact and knowledge seeking into, well, fact and knowledge seeking.

Don’t read this as me saying these tools have no value. That’s not my point at all. I’m making a bounded statement about my experiences with these tools.

The other thing I’ve learned, which surprised me at first, is that using an LLM to answer a question, or write code, or solve a problem makes me feel … nothing.

I was drawn to computing, and ultimately studied and pursued it as a profession, because I found it rewarding. Not just financially – though getting well paid to do something I enjoyed was certainly not a bad thing. Solving problems – especially tricky problems – feeds my brain endorphins.

And who doesn’t love endorphins, right?

I love learning, and each time I attack a problem – win or lose – it changes me a bit. Teaches me something.

I get none of those rewards when I use these tools. The victory feels hollow. As if I’ve cheated. Or have been cheated.

A friend and former colleague said we’ve “…largely become a culture of answer seekers, not knowledge seekers. We want the answer, but don’t particularly care to understand why or how. This was a problem before AI.”

I think he’s right, and his observation touches a nerve. One of my most valuable (and most irritating) habits was instilled in me at a young age by my uncle Denis – an actual working scientist who told me to “Always ask why.”

Suppressing that impulse, lessening the drive to understand, makes me… sad.

I stumbled across an analogy that resonated with me – using language models and chat bots to write or solve “thinking” problems for you, this author said, is like bringing a forklift to the gym to lift weights. If your only goal is to lift the weights, fantastic, job done – provided the model doesn’t drop the weight on someone’s toes, or decide to drive through the locker room instead. But if any part of the goal is to become a person who can lift weights … learning to drive a forklift is becoming someone who can drive a forklift, not someone who can lift weights.

If what you need to do is move lots of heavy things, over and over, day in and day out, and that’s all there is, by all means use a forklift. Just realize that what you’re getting good at is driving a forklift.

This might lead you to ask the entirely reasonable question – “so what?”

Well, from a personal perspective, I just don’t use these tools much. I don’t pay for a chatbot and don’t imagine that changing. I occasionally ask Gemini questions – when it’s not outright fabricating things, the model is pretty good at summarization. I’ve had Claude write code – especially when I already know what the code needs to do and the cost of verifying it’s “done it right” is lower than the cost of me just doing it.

But I don’t use AI every day. Or even most days.

I don’t ask a chatbot for feedback on my writing, for instance. I write it, read it, revise it, and sometimes ask other people to read and critique it too. So my writing has occasional typos, sometimes mixes metaphors (thanks, Matt!), and can be a bit awkward.

And that’s ok.

Over the years I’ve been writing, I’ve gotten better at it – and the point isn’t just to lift the weights.

Not All Movement Is Progress

I was having a conversation with a friend recently. We both work in “Big Tech.” Both our employers are sensitive to employees making remotely public statements – and his in particular is pretty notorious for over-reacting. So in an abundance of caution I’ll point out that this rambling represents neither of our companies – just a conversation between two people who’ve been in tech “a fair while.”

We were talking about the most recent operating system released by a Big Tech company that runs on their fruit-themed hardware.

The release has caused a bit of a kerfuffle – it no longer runs a set of applications that were supported by its predecessor. Like other transitions in this company’s past, this one was deliberate and foreshadowed across a couple years. If a customer depends on an application that this OS won’t run, they find an alternative, convince themselves they don’t really need that application, or they don’t upgrade. This last is problematic. Not in the short term – they’ll still get critical updates for their old operating system for a while – but eventually. And if they need to buy new fruit-themed hardware, that new hardware likely won’t run that old operating system. So those customers are one hardware failure away from running out of options.

It’s also generally been a bit of a bumpy release. The initial release had more than its fair share of issues, and even after a couple minor releases there are ongoing sources of customer pain and breakage. I’ve encountered some of these bumps personally, and I reached out to this friend – on a personal basis – to relate my anecdote.

I think it’s safe to say that most tech consumers don’t have personal contacts inside “Big Tech.” They can potentially contact support if they have a problem, but that’s where it ends.

“I don’t expect you to fix this,” I started, “I just want you to hear the unfiltered voice of your customer.” I went on to explain the problem, and how I thought they could have given their customers more options – as they had during past large technology transitions.

I pointed out that, from my perspective, this transition was different from the big transitions in the past. In the past, I argued, customers saw a difference, and that visibility made the changes – even the unwelcome ones – easier to understand. During the transition from their “classic” operating system to their NeXT (sic) generation operating system c. 2001, everything looked different. When they changed the CPU their machines were built around c. 2006, customers bought new computers. Those changes were moments of transition – painful for some customers – that enabled new things.

This time, a customer who “upgrades” their software gets to do less. And it’s pretty hard to explain to a customer how being able to do less enables new things.

That got me thinking about the idea of progress in computing and software. This is a case where we’re “improving” a computing system by making it do less than it could before – and that doesn’t feel like progress.

Maybe I’m just old. Part of me can’t give up the Apple ][ that I could open the cover on, and basically understand from the component level up.

That’s not computers anymore. And as magical as carrying the internet around in my pocket, or on my wrist, is – and it is – I think we’ve lost some valuable things along the way.

Postmortem of a catastrophic RAID failure

Wednesday of last week, I came home to find my three new 1TB hard disks waiting for me, destined to upgrade our ReadyNAS NV+.

The NAS being a hot-plug-online-upgradable-all-singing-all-dancing sort of widget, I followed the recommended upgrade procedure and popped out one of the current 500GB drives, waited a few seconds, slotted in one of the new 1TB replacements, waited till it started resynchronizing the volume, and went down to make dinner.

And spent the next several days picking up the pieces…

One critical bit of background – the NAS had three disks in a single RAID-5 volume. RAID-5 can tolerate one disk failure without data loss, but if two disks fail (regardless of the number of disks in the volume), kiss your data goodbye.
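If it helps to see why: each RAID-5 stripe stores a parity block that’s just the XOR of the data blocks, so any one missing block can be recomputed from the survivors – but with two blocks gone there’s nothing left to solve with. A toy sketch (a hypothetical three-disk stripe, not the ReadyNAS’s actual on-disk layout):

```python
# Toy RAID-5 parity demo: the parity block is the XOR of the data
# blocks in a stripe, so any *one* missing block is recoverable.

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte strings together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

data1 = b"block on disk 1."        # data block on disk 1
data2 = b"block on disk 2."        # data block on disk 2
parity = xor_blocks(data1, data2)  # what disk 3 stores for this stripe

# Disk 1 dies: rebuild its block from disk 2 plus parity.
assert xor_blocks(data2, parity) == data1

# Disks 1 *and* 2 die: one parity equation, two unknowns -- no unique
# solution, and the stripe (hence the volume) is unrecoverable.
```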

When I went back upstairs after dinner to check on progress I discovered that the NAS had locked up and completely dropped off the network. It wouldn’t answer on its web management UI, and wasn’t responding to pings.

Hesitantly, I power-cycled it. It started booting, and hung about a quarter of the way through checking the volume.

After several reboot attempts all locking up at the same place, I applied a bit of coercion and convinced the box to boot. I checked the system logs and found nothing telling, removed and re-seated the new 1TB drive, and watched it start the resync again.

A couple hours later, sync still proceeding, I went to bed.

And woke the next morning to find the unit had again fallen off the network.

Buried in the log messages – which I’d left scrolling past overnight – was a warning that disk 2 was reporting SMART errors about having to relocate failing sectors.

In other words, one disk of the three was being rebuilt while another one was busy dying.
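For what it’s worth, smartmontools will show you that counter on demand, instead of letting it scroll past in a log. A minimal sketch – assuming smartctl is installed and that /dev/sdb is the disk you care about, both things you’d adjust for your own system:

```python
# Minimal sketch: read a drive's reallocated-sector count via smartctl
# (smartmontools). A raw value that's nonzero and climbing is exactly
# the early warning that was buried in my scrolling logs.
import subprocess

def reallocated_sectors(device: str) -> int:
    """Return the raw Reallocated_Sector_Ct value for `device`."""
    # smartctl encodes warnings in its exit status, so don't treat a
    # nonzero exit as fatal -- just parse whatever it printed.
    out = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    ).stdout
    for line in out.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "Reallocated_Sector_Ct":
            return int(fields[9])  # the RAW_VALUE column
    raise RuntimeError(f"no Reallocated_Sector_Ct attribute on {device}")

if __name__ == "__main__":
    print(reallocated_sectors("/dev/sdb"))  # device path is an assumption
```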

At this point it became a race – would the rebuild complete (leaving me with two good disks, and intact data) before the failing disk died completely?

To try to buy some insurance, I shut down the NAS, transplanted the failing drive into a spare PC, and started a disk-to-disk copy of its data onto the working 500GB disk I had removed at the start of this mounting disaster.

Despite valiant attempts by both dd_rescue and myrescue, the disk was dying faster than data could be retrieved, and after a day and a half of effort, I had to face the fact that I wasn’t going to be able to save it.

Fortunately, I had set up off-site backups using CrashPlan, so I had Vince bring my backup drive to work, and retrieved it from him on Friday.

Saturday was spent restoring our photos, music, and email (more later) from the backup.

Unfortunately, despite CrashPlan claiming to have been backing up Dawnise’s inbox, it was nowhere to be found in the backup set, and the most recent “hand-made” backup I found was almost exactly a year old (from her PC-to-Mac conversion). Losing a year of email is better than losing everything, but that seems like meager consolation under the circumstances.

By Saturday night I had things mostly back to rights, and had a chance to reflect on what had gone wrong.

The highlights:

1. SMART, as Google discovered (and published), is a terrible predictor of failure. The drive that failed (and is being RMA’d under warranty, for all the good it’ll do me) had never issued a SMART error before catastrophically failing.

2. In retrospect, I should have rebooted the NAS and done a full volume scan before starting the upgrade. That might have put enough load on the failing drive to make it show itself before I had made the critical and irreversible decision to remove a drive from the array.

3. By failing to provide disk scrubbing (a process whereby the system periodically touches every bit of every hard disk – see the sketch after this list) the ReadyNAS fails to detect failing drives early.

4. While I had done test restores during my evaluation of CrashPlan, I had never actually done a test restore to Dawnise’s Mac. Had I done so, I might have discovered the missing files and been able to avoid losing data.

I have a support ticket open with the CrashPlan folks, as it seems there’s a bug of some kind here. At the very least, I would have expected a warning from CrashPlan that it was unable to back up all the files in its backup set.

5. In my effort to be frugal, I bought a 500GB external drive to use as my remote backup destination – the sweet spot in the capacity/cost curve at the time.

Since I had more than 500GB of data, that meant I had to pick and choose what data I did and didn’t back up. My choices were ok, but not perfect. There’s some data lost that should have been in the backup set, but wasn’t, due to space limitations.

6. CrashPlan worked well – but not flawlessly – and without it, I’d have been in a world of hurt. Having an off-site backup means that I didn’t lose my 20GB worth of digital photos, or several hundred GB of ripped music.

Aside from digital purchases, the bulk of the music would have been recoverable from the source CDs, but at a great cost in time. The photos would simply have been lost.

7. In this case, the off-site aspect of CrashPlan wasn’t critical, but it’s easy to imagine a scenario where it would have been.

8. The belief that RAID improves your chances of retaining data is built largely on what I’ll henceforth refer to as “the RAID fallacy” – the assumption that failure modes of the drives in the array are completely independent events. The reality is that many (most?) RAID arrays are populated with near-identical drives. Same manufacturer, same capacity (and model), and often the same or very similar vintage. So the drives age together under similar workloads, and any inherent defect (like, say, a firmware bug that causes the drives not to POST reliably) is likely to affect multiple drives, which spells disaster for the volume.
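To put rough numbers on the fallacy – and these are made-up illustrative rates, not measurements – compare the chance of a double failure with and without the independence assumption:

```python
# Back-of-the-envelope: why "RAID-5 protects me" leans on independence.
# The rates below are made-up illustrative numbers, not measurements.

p_year = 0.03            # assume a 3% chance a given drive dies in a year
p_window = p_year / 365  # naive chance of dying during a 1-day rebuild

# Independent case: one drive has died; what's the chance one of the
# two survivors also dies before the rebuild finishes?
p_independent = 1 - (1 - p_window) ** 2
print(f"independent: {p_independent:.4%}")   # ~0.016%

# Correlated case: same-batch drives, aged together, hammered by the
# rebuild. If the conditional failure chance is (say) 100x the naive
# rate, the comforting number above evaporates.
p_correlated = 1 - (1 - 100 * p_window) ** 2
print(f"correlated:  {p_correlated:.2%}")    # ~1.6%
```

The 100x multiplier is pulled out of the air, but the shape of the argument isn’t: the correlation is doing all the damage.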
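As for the scrubbing from point 3 (the sketch I promised): conceptually it’s nothing exotic – just read every sector on a schedule, so a drive trips over its weak spots while the array is still healthy rather than mid-rebuild. A sketch of the idea, assuming a readable block device path – and emphatically not the ReadyNAS’s implementation:

```python
# Sketch of a scrub pass: read every byte of a device so failing
# sectors surface (and get remapped) on our schedule, not during a
# rebuild. Run against e.g. /dev/sdb, which typically needs root.
import sys

CHUNK = 4 * 1024 * 1024  # read 4 MiB at a time

def scrub(device: str) -> int:
    """Read every byte of `device`; return the count of unreadable chunks."""
    bad = 0
    offset = 0
    with open(device, "rb", buffering=0) as disk:
        while True:
            disk.seek(offset)
            try:
                chunk = disk.read(CHUNK)
            except OSError:
                bad += 1         # an I/O error here is the early warning we want
                offset += CHUNK  # step past the unreadable region and keep going
                continue
            if not chunk:
                break            # end of device
            offset += len(chunk)
    return bad

if __name__ == "__main__":
    print(f"{scrub(sys.argv[1])} unreadable chunk(s)")
```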