GitHub Copilot for Vim Review

The impact of Large Language Models (LLMs) on the field of software development is arguably one of the most debated topics in developer circles today, sparking discussions at meetups, in lunchrooms, and even during casual chats among friends. I won't attempt to settle that debate definitively in this post, largely because I lack the foresight required. My track record for predicting the long-term success or failure of new technologies is, frankly, about as accurate as a coin flip. In fact, if I personally dislike a technology, it seems destined to become an industry standard.

However, I do believe I'm well-positioned to weigh in on a much more specific question: Is GitHub Copilot beneficial for me within my primary work environment, Vim? I've used Vim extensively as my main development tool for well over a decade, spending roughly 4-5 hours in it daily, depending on my meeting schedule. My work within Vim involves a variety of technologies, including significant amounts of Python, Golang, Terraform, and YAML. Therefore, while I can't provide a universal answer to whether an LLM is right for you, I can offer concrete opinions based on my direct experience with GitHub Copilot as a dedicated Vim user today.

Testing

So just to prove I really set it up:

It's a real test; I've been using it every day for this time period. I have it set up in what I believe to be the "default configuration".

The Vim plugin I'm using is the official one located here: https://github.com/github/copilot.vim

How (I think) the plugin works

The plugin uses Vimscript to capture the current state of the editor. That includes stuff like:

The entire content of the current buffer (the file being edited).
The current cursor position within the buffer.
The file type or programming language of the current buffer.

That state is sent as a request to a Node.js language server that runs alongside the Vim/Neovim plugin. The server processes the provided context and constructs a request to the actual GitHub Copilot API running on GitHub's servers. This request includes the code context and other relevant information needed by the Copilot AI model to generate suggestions.

The plugin receives the suggestions from the language server. It then integrates these suggestions into the Vim or Neovim interface, typically displaying them as "ghost text" inline with the user's code or in a separate completion window, depending on the plugin's configuration and the editor's capabilities.
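To make the flow above a bit more concrete, here is a minimal sketch of the kind of editor state described in the list. Everything in it is hypothetical: the class name, the field names, and the request shape are invented for illustration and are not the plugin's actual protocol. The only assumption carried over from the text is that the buffer content, cursor position, and file type travel together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditorSnapshot:
    """Hypothetical container for the editor state listed above."""

    buffer_text: str  # entire content of the current buffer
    cursor_line: int  # 1-based line number, the way Vim reports it
    cursor_col: int   # 1-based column, the way Vim reports it
    filetype: str     # e.g. "python", "go", "terraform"


def build_request(snapshot: EditorSnapshot) -> dict:
    """Turn a snapshot into an illustrative request body.

    The keys are made up. Positions are shifted to 0-based because that
    is the convention language servers generally use. Multi-byte
    characters are ignored to keep the sketch short.
    """
    return {
        "language": snapshot.filetype,
        "text": snapshot.buffer_text,
        "position": {
            "line": snapshot.cursor_line - 1,
            "character": snapshot.cursor_col - 1,
        },
    }
```

The point is only that a suggestion request is a function of these three inputs, which is why the same comment can produce different completions in different files.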

How it feels to use

As you can tell from the output of vim --startuptime vim.log, the plugin is actually pretty performant and doesn't add notable time to my initial launch.
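If you want a number instead of eyeballing the log, a small script can add up the time spent sourcing the plugin's files. This is a rough sketch that assumes the usual startuptime line format for sourced scripts (clock, self+sourced, self, then "sourcing" and a path). The function name and the "copilot" substring match are assumptions of this example.

```python
import re

# Matches lines such as:
# 010.500  000.750  000.750: sourcing /path/to/copilot.vim/plugin/copilot.vim
SOURCING_LINE = re.compile(
    r"^\s*(?P<clock>\d+\.\d+)\s+(?P<total>\d+\.\d+)\s+(?P<self>\d+\.\d+):"
    r"\s+sourcing\s+(?P<path>.+)$"
)


def plugin_startup_ms(log_text: str, needle: str = "copilot") -> float:
    """Sum the 'self' time (in msec) of every sourced script whose path contains needle."""
    total = 0.0
    for line in log_text.splitlines():
        match = SOURCING_LINE.match(line)
        if match and needle in match.group("path").lower():
            total += float(match.group("self"))
    return total
```

Usage would be something like running vim --startuptime vim.log +qall and then calling plugin_startup_ms(open("vim.log").read()).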

In terms of the normal usage, it works like it says on the box. You start typing and it shows the next line it thinks you might be writing.

The suggestions don't do much on their own. Basically the tool isn't smart enough to even keep track of what it has already suggested. So in this case I've just tab completed and taken all the suggestions and you can tell it immediately gets stuck in a loop.

Now you can use it to "vibe code" inside of Vim. That works by writing a comment describing what you want to do and then just tab accepting the whole block of code. So for example I wrote the comment "Write a new function to check if the JWT is encrypted or not." It produced the following.

So I made a somewhat misleading comment on purpose. I was trying to get it to write a function to see if a JWT was actually a JWE. Now this Python code is (obviously) wrong. The function is_jwt_encrypted assumes the token will always have exactly three parts separated by dots (header, payload, signature). This is the structure of a standard signed JSON Web Token (JWT). However, a JSON Web Encryption (JWE) token, which is what a wrapped encrypted JWT is, has five dot-separated parts in its compact form:

Protected Header
Encrypted Key
Initialization Vector
Ciphertext
Authentication Tag

So this gives you a rough idea of the quality of the code snippets it produces. If you are writing something dead simple, the autogenerate will often work and can save you time. However, go even a little bit off the golden path and, while Copilot will always give it a shot, the quality is all over the place.
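For contrast, a check that respects the five-part structure listed above could look roughly like this. It is a minimal sketch, not a validator: the function names are made up for this example, it only looks at the shape of the token and the presence of an "enc" header parameter, and it does not decrypt or verify anything.

```python
import base64
import json


def _b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JOSE tokens strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def is_jwe(token: str) -> bool:
    """Return True if the token looks like a compact-serialized JWE.

    A signed JWT has three dot-separated parts; a compact JWE has five.
    The encrypted key part may be empty, so only the count is checked.
    """
    parts = token.split(".")
    if len(parts) != 5:
        return False
    try:
        header = json.loads(_b64url_decode(parts[0]))
    except ValueError:
        # Covers bad base64, bad UTF-8, and bad JSON alike.
        return False
    # JWE headers carry an "enc" (content encryption algorithm) parameter.
    return isinstance(header, dict) and "enc" in header
```

In real code you would hand the token to a JOSE library rather than splitting strings yourself, but even this much avoids the three-part assumption.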

Scores Based on Common Tasks

Reviewing a product like this is extremely hard because it does everything all the time and changes daily with no notice. I've had weeks where it seems like the Copilot intelligence gets cranked way up and weeks where it's completely brain dead. However, I will go through some common tasks I have to do all the time and rank it on how well it does.

Parsing JSON

90/100

This is probably the thing Copilot is best at. You have a JSON that you are getting from some API and then Copilot helps you fill in the parsing for that so you don't need to type the whole thing out. So just by filling in my imports it already has a good idea of what I'm thinking about here.

So in this example I write the comment with the example JSON object and then it fills in the rest. This code is....ok. However I'd like it to probably check the json_data to see if it matches the expectation before it parses. Changing the comment however changes the code.

This is very useful for me as someone who often needs to consume JSONs from source A and then send JSONs on to target B. Saves a lot of time and I think the quality looks totally acceptable to me. Some notes though:

Python type hints greatly improve the quality of the suggestions
You need to check to make sure it doesn't truncate the list. Sometimes Copilot will "give up" like 80% through writing out all the items. It doesn't often make up ones, which is nice, but you do need to make sure everything you expected to be there ends up getting listed.
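As an illustration of the "check json_data before parsing" idea and of the type-hint note above, here is a minimal sketch. The payload shape (id, name, email) and the names User and parse_user are invented for this example; substitute whatever your API actually returns.

```python
import json
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str
    email: str


# Hypothetical expected fields and their types.
EXPECTED_FIELDS = {"id": int, "name": str, "email": str}


def parse_user(raw: str) -> User:
    """Parse a JSON string into a User, failing loudly on unexpected shapes."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, expected_type in EXPECTED_FIELDS.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(f"field {key} should be {expected_type.__name__}")
    return User(id=data["id"], name=data["name"], email=data["email"])
```

Having the dataclass and the EXPECTED_FIELDS mapping in the file also gives you something to diff against when checking whether a generated field list was truncated.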

Database Operations

40/100

I work a lot with databases, like everyone on Earth does. Copilot definitely understands the concepts of databases but your experience can vary wildly depending on what you write and the mood it is in.

I mean this is sort of borderline making fun of me. Obviously I don't want to just check if the file named that exists?

This is better but it's still not good. If there is no file there at all, sqlite3.connect will just create one, and if there is a file sitting there with the right name that isn't a database, the connection still opens and only fails on the first query. The except sqlite3.Error part is super shitty. Obviously that's not what I want to do. I probably want to at least log something?
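A version closer to what the check should do might look like the sketch below. It assumes you want three things: never create the file as a side effect, actually confirm the file is a SQLite database, and log the reason on failure. The function name is made up, and opening through a read-only URI is one way to do this, not the only one.

```python
import logging
import sqlite3
from pathlib import Path

logger = logging.getLogger(__name__)


def is_sqlite_database(path: str) -> bool:
    """Return True only if path exists and can be read as a SQLite database."""
    # mode=ro opens read-only and refuses to create a missing file.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.Error as exc:
        logger.warning("could not open %s: %s", path, exc)
        return False
    try:
        # Connecting is lazy, so force SQLite to read the file header.
        conn.execute("SELECT count(*) FROM sqlite_master").fetchone()
        return True
    except sqlite3.DatabaseError as exc:
        logger.warning("%s is not a usable SQLite database: %s", path, exc)
        return False
    finally:
        conn.close()
```

Note that an empty zero-byte file counts as a valid (empty) database to SQLite, so this check will accept it.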

Let me show another example. I wrote the comment "Write a method to create a table in the SQLite database if it does not already exist with the specified schema." Then I typed user_ID UUID and let it fill in the rest.

Not great. What it ended up making was even worse.

We're missing error handling, there are no try/finally blocks around the connection and cursor, etc. This is pretty shitty code. My experience is it doesn't get much better the more you use it. Some tips:

If you write out the SQL in the comments then you will have a way better time.

CREATE TABLE users (
	contact_id INTEGER PRIMARY KEY,
	first_name TEXT NOT NULL,
	last_name TEXT NOT NULL,
	email TEXT NOT NULL UNIQUE,
	phone TEXT NOT NULL UNIQUE
);

Just that alone seems to make it a lot happier.

Still not amazing but at least closer to correct.
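For reference, here is a hedged sketch of what a more careful version could look like, using the schema from the comment above. The function name and the choice to log and re-raise are assumptions of this example; the parts that matter are IF NOT EXISTS, closing the connection no matter what, and not swallowing the error.

```python
import logging
import sqlite3
from contextlib import closing

logger = logging.getLogger(__name__)

USERS_SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    contact_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    email TEXT NOT NULL UNIQUE,
    phone TEXT NOT NULL UNIQUE
);
"""


def create_users_table(db_path: str) -> None:
    """Create the users table if it does not exist yet."""
    try:
        # closing() guarantees conn.close(); "with conn" commits on success
        # and rolls back if the statement raises.
        with closing(sqlite3.connect(db_path)) as conn:
            with conn:
                conn.execute(USERS_SCHEMA)
    except sqlite3.Error:
        logger.exception("failed to create users table in %s", db_path)
        raise
```

Calling it twice is safe, which is the whole point of IF NOT EXISTS.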

Writing Terraform

70/100

Not much to report with Terraform.

So why the 70/100? I've had a lot of frustrations with Copilot hallucinations with Terraform where it will simply insert arguments that don't exist. I can't reliably reproduce it, but this is something that can really burn a lot of time when you hit it.

My advice with Terraform is to run something like terrascan afterwards, which will often catch weird stuff it inserts: https://github.com/tenable/terrascan

However I will admit it saves me a lot of time, especially when writing stuff that is mind-numbing like 1000 DNS entries. So easily worth the risk on this one.
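On the invented-argument problem specifically, terraform validate also rejects arguments a resource doesn't support, and its -json flag makes the result easy to script. The sketch below only parses that JSON report; the function name is made up, and the field names (diagnostics, severity, summary, detail, range) are an assumption based on the validate JSON output as I understand it, so check them against your Terraform version.

```python
import json


def validation_errors(validate_json: str) -> list[str]:
    """Extract human-readable error lines from `terraform validate -json` output."""
    report = json.loads(validate_json)
    errors = []
    for diag in report.get("diagnostics", []):
        if diag.get("severity") != "error":
            continue
        location = diag.get("range") or {}
        filename = location.get("filename", "?")
        line = (location.get("start") or {}).get("line", "?")
        summary = diag.get("summary", "")
        detail = diag.get("detail", "")
        errors.append(f"{filename}:{line}: {summary}: {detail}")
    return errors
```

You would feed it the stdout of terraform validate -json, for example captured with subprocess.run, and fail the commit hook or CI job if the returned list is not empty.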

Tips:

Make sure you set g:copilot_workspace_folders in your vimrc, for example: let g:copilot_workspace_folders = ['/path/to/my/project1', '/path/to/another/project2']
That seems to ground the LLM in the rest of the code and allows it to detect things like "what is the cloud account you are using".

Writing Golang

0/100

This is a good summary of my experience with Copilot with Golang.

I don't know why. It will work fine for a while and then at some point, roughly when the Golang file hits around 300-400 lines, it seems to just lose it. Maybe there's another plugin I have that's causing a problem with Copilot and Golang, maybe I'm holding it wrong, I have no idea.

There's nothing in the logs I can find that would explain why it seems to break on Golang. I'm not going to file a bug report because I don't consider this my job to fix.

Summary

Is Copilot worth $10 a month? I think that really depends on what your day looks like. If you are someone who:

Writes microservices where the total LoC rarely exceeds 1000 per microservice
Spends a lot of time consuming and producing JSONs for other services to receive
Is capable of checking SQL queries and confirming how they need to be fixed
Has good or great test coverage

Then I think this tool might be worth the money. However, if your day looks like this:

You spend most of your day inside of a monolith or large codebase carefully adding new features or slightly modifying old features
You don't have good test coverage, or any at all
You don't have a good database migration strategy

I'd say stay far away from Copilot for Vim. It's going to end up causing you serious problems that are going to be hard to catch.