More LLM Musings: Notes from the Frontier



My LLM notes

Since the response to my last post was encouraging, I thought I’d expand on what I’ve been thinking about since then.

The future is already here – it’s just not evenly distributed.

AI adoption curve

I still encounter hesitation from senior folks when it comes to adopting AI tools. Change is difficult, but be aware: our industry doubles in size every (n) years. That means in (n) years, half the people you interact with will have less than (n) years of experience writing software. Their only lived experience will be using these types of tools as ‘AI-native’ developers.

Don’t say you weren’t warned.

I also see ‘but the LLMs put out buggy code’ arguments - I’ve always chuckled at these because I think some of us have forgotten who’s staring back at us in the mirror. Our entire careers have been predicated on putting out buggy code. :)

Scaling AI development

I was listening to my buddy Harper on the Intelligent Machines podcast - it’s a fun podcast, go have a listen. He referenced a friend who had a GitHub repository with hundreds of PRs created by LLMs. I’ve always been a huge proponent of the idea that, given the right structures and flow, change should be cheap. If you have the budget to spin up a few instances of Claude Code, it’s quite amazing what you can accomplish with an agent + worktrees… which leads me to the next point.
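For the curious, the agent + worktrees setup can be sketched in a few lines of shell. This is a hypothetical sketch, not a prescribed workflow: each task gets its own git worktree (an isolated checkout on its own branch) so parallel agents never step on each other’s changes. The task names and the commented-out `claude` invocation are illustrative assumptions.

```shell
#!/bin/sh
set -e
# Sketch: one git worktree per agent task, so several Claude Code
# (or Aider/Codex) instances can run in parallel without colliding.
# Task names below are made up for illustration.
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo && cd demo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

for task in fix-flaky-tests update-docs; do
  # Each worktree is a full checkout on its own branch.
  git worktree add -q "../wt-$task" -b "agent/$task"
  # Launch one agent per worktree, e.g.:
  # (cd "../wt-$task" && claude -p "Work on: $task") &
done

git worktree list   # the main checkout plus one worktree per task
```

Each agent commits on its own `agent/<task>` branch, so merging their work back is just ordinary branch review.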

Black box

I was chatting with another buddy, Austen (who was briefly in town), and an interesting question came up while we were discussing LLMs: is it important that we ‘care’ about LLM outputs? That is, can we treat them as black boxes, or do we need to understand what’s going on under the hood?

I argue that in the long term, we will not and should not care; it’ll be yet another level of abstraction, albeit a fuzzy one. Just as most of us don’t understand (or care about) the optimizations LLVM performs, over time LLMs should expose a more accessible, natural-language entry point that democratizes software development.

In the near to medium term, we should care, because they’re still a bit finicky. Keep up with SWE-bench and other benchmarks to understand where the state of the art (SOTA) is in our industry. That can help you answer “should I use an LLM for this?”-style questions.

As an industry, I think we do a lot of unnecessary gatekeeping and mystifying. We’ve always embraced abstractions that make development more accessible; LLMs should be the next evolution in that journey.

Cursor

I might be in the minority, but I don’t think Cursor is very good - it’s definitely better than the autocomplete of old, but I didn’t think that was very good either.

These vendors aren’t incentivized to tell you the specific use cases their product is actually ideal for, so we should be thoughtful about when we reach for it.

Austen used the word ‘magical’ to describe what coding with one feels like, and I don’t disagree, but that feeling can blind us to its real utility. It definitely helps us code faster/better/more, but are we providing value to the user, or are we using it to do less work? The latter is also fine - I’m a proponent of it - but I think we should be clear about which one we’re doing.

A tale of two modes

Speaking with friends in the industry, we’re using the tools in primarily two ‘modes’:

  1. Assistant Mode: AI as an ultra autocomplete (e.g. Cursor, without using its agent).
  2. Agent Mode: AI as a terminal agent (e.g. Aider, Claude Code, Codex).

I’d encourage exploring both options, as I think the end-game is Agent-land - don’t miss what’s happening on the other side of the pond.

In my own personal projects, I enable both modes: Assistant mode when I want more fine-grained control over the outputs, and Agent mode for any task I think the LLM can execute on its own.

AI avalanche

There is an avalanche of AI slop coming down the content mountain - just log into LinkedIn to see what I mean. The signal-to-noise ratio has deteriorated to a point where curation fiefdoms will become ever more important.

At any point in the timeline, I think Main Street will always underestimate how the tools can be leveraged - so it’s good to have peripheral knowledge about how far the technology has come. I’ve long advocated for an accessible site that highlights SOTA examples of what the technology can do so that people can start to internalize and think about the content they’re consuming.

As a note, I don’t have a problem with the game that’s being played; I just wish people were more transparent about what’s being done with the tools.

Custom software

I have been using a lot of these tools to create custom software for my personal workflow. I sometimes wonder if this is where we’ll end up: everyone with an effectively unlimited supply of custom software tailored to their own unique processes and preferences.


If you’ve made it this far again… <3

Some things to think about…

  1. How do you determine when an LLM is the right tool to use for a task?
  2. What mode (assistant / agent) do you find most useful and when?

I’d love to hear about your LLM experiences.

Hope to see y’all next time!

This post was written by yours truly and spellchecked/organized/title completed by AI. I am still trying to figure out a way to be transparent about how much of what I write is being edited by a language model.