Thoughts on the future of software development

Guy Waldman

Today, GitHub announced GitHub Copilot, which is essentially AI-powered code auto-completion; IntelliSense on steroids, if you will.
As the name suggests, Copilot is the ultimate pair programmer you always wished for: you start typing a function signature, for example, and it auto-completes the rest of the function.
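To make that concrete, here is an illustration of the interaction. The function and its body are made up for this post; only the signature and docstring are what you would type, and the body is the kind of completion such a tool might generate (not actual Copilot output):

```python
# You type the signature and docstring...
def is_palindrome(s: str) -> bool:
    """Return True if s reads the same forwards and backwards,
    ignoring case and non-alphanumeric characters."""
    # ...and the assistant might suggest a body like this:
    normalized = "".join(ch.lower() for ch in s if ch.isalnum())
    return normalized == normalized[::-1]
```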

From https://copilot.github.com

I was actually privileged enough to be included in the beta, but due to a lack of free time these past few weeks I had not tried it out. Boy, was I missing out!

So what's under the hood? Well, GitHub Copilot is powered by OpenAI Codex, a model descended from GPT-3, OpenAI's sophisticated language model. And what is GPT-3? It is an autoregressive language model that utilizes deep learning. In layman's terms: it predicts text, and is trained using a neural network. You can "program" the model using examples (think: Q&A), after which, given a new and perhaps unfamiliar input, it will generate a predicted answer according to the patterns in its training data. For example, you could prompt GPT-3 with examples of made-up arithmetic expressions that use emojis as operators, along with their answers, and given a new such expression it will do its best to generate an answer based on the examples you gave it.
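The emoji-arithmetic idea above can be sketched as a few-shot prompt. The snippet below only builds the prompt string; the emoji operators and the Q&A format are made up for illustration, and a real application would send this prompt to the model's API rather than stopping here:

```python
# Few-shot "programming": show the model a pattern via examples,
# then end the prompt with a new query for it to complete.
# The operators here are invented for illustration.
examples = [
    ("3 ✚ 4", "7"),    # ✚ stands in for addition
    ("10 ✚ 5", "15"),
    ("2 ✖ 6", "12"),   # ✖ stands in for multiplication
]

def build_prompt(query: str) -> str:
    lines = [f"Q: {q}\nA: {a}" for q, a in examples]
    lines.append(f"Q: {query}\nA:")  # the model completes after "A:"
    return "\n".join(lines)

prompt = build_prompt("4 ✖ 5")
print(prompt)
```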
Microsoft has gained exclusive rights to GPT-3 (meaning that while others may access it through the API, only Microsoft can modify the underlying model) and has leveraged this to train it on, I assume, as much source code as the internet could supply.

People have used GPT-3 for a lot of creative applications; one that jumps to mind is someone prompting the model to generate a web application from instructions in natural language - see this tweet.
GPT-3 is amazing (and even occasionally scary) at picking up on patterns you couldn't see with the naked eye.

However, as with anything, there are downsides. Notably:

  1. This field is still young and emerging, and some of the predictions are not quite there yet
  2. The predictions don't necessarily optimize for readability, performance, maintainability or security - humans are better at this, and probably will be for the foreseeable future
  3. The predictions don't necessarily consider the context in which they are written (e.g. other included files in the project)

Therefore, if you use these tools, please carefully audit the generated code. As mentioned, even if a suggestion works, it was learned from existing source code (mostly open source, I assume), so if you're not careful you may well end up with unmaintainable code, subtle bugs, security vulnerabilities, etc.

Note that while Copilot is the new kid on the block, it is not the only such product. One notable contender is TabNine (which merged with Codota; the two now operate as a single product). I have personally used TabNine and can attest that it felt like magic at times, as Copilot does now. With time, I am certain these products will only improve and introduce new innovations. There is also Kite. Another honorable mention, though aimed at symbol completion rather than completing entire snippets of code, is IntelliCode. This is also a tool I have used, and it does come in handy at times.

By the way, Hebrew speakers: Eran Yahav, the CTO of TabNine (among other things), was a guest a while back on a popular Israeli podcast about software development. They discussed the integration of AI into our development workflows, and it is very relevant to this blog post. Check it out here; highly recommended.

Now, to all of you reading this right now scared for your jobs - don't be! If anything, I believe this will only make our lives easier. The days of replacing humans with fully automated software craftsmanship are still a distant fantasy (if they ever arrive) - but AI-assisted programming, with a digital pair programmer guiding you towards the desired output, is what I believe to be the future. It may struggle with delicate and perhaps convoluted logic; however, for mundane tasks this could be a serious productivity boost.

Don't be this guy. We're good... for now.

Alright, Now What?

So I think we've established AI-assisted auto-complete is cool beans. Now what?
Well, I think these developer tools have a lot of potential. As mentioned, mundane and repetitive tasks in particular could often be automated away. Seriously, try these tools - you'll see what I mean.
Now, as this is an emerging field, these applications will only get better as research and domain knowledge advance. As one of my favorite YouTube channels, Two Minute Papers, likes to put it: just imagine what will happen two papers down the line! So as mentioned, I'm not wary but actually very excited about what development platforms will enable in the next couple of years, and I have a vision.

...

Oh, what's the vision, you ask? Right, sorry.

So, what I envision is that no-code/low-code tools (e.g. Wix, WordPress, Webflow, Bubble) will - in line with their value propositions - not require much code. Astounding, I know!
However, unlike today, when they do require code, I think they could rely on code generated automatically by a sophisticated AI-powered language model (GPT-42, perhaps).

Now, I don't think these models will be perfect, even a decade or two down the line; their output will probably require tweaking, since the alternative might simply be too risky or infeasible. In my opinion, this "tweaking" could look a lot like a code review process.

So, consider this flow:

  1. You describe the requirements in natural human language
  2. You provide tests for verifying the logic - either describe them (which admittedly might make this kind of a circular problem), or write them out specifically
  3. The model generates an iteration of code
  4. "Code review" - you comment on the parts where the logic is incorrect, doesn't make sense, doesn't compile etc. These comments will describe why the logic is incorrect, what it should be changed to, and perhaps optionally provide a predicate to verify that the logic is correct
  5. Optionally, manually code sections where you think the AI needs more hand-holding
  6. The model applies the comments from the code review and presents another iteration
  7. You either accept the changes or reject them; for the latter, you would go back to step #3
  8. The requirements, code output, code review comments, and the changes that resolve them are all "fed back" into the system as potential training data to enhance the algorithm and model (which would yield valuable insights)
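The flow above could be sketched as a loop. Everything below is a toy simulation: the model and the reviewer are stub functions invented for this sketch (not any real API), standing in for steps 3 through 7:

```python
# Toy simulation of the proposed flow. All function names are
# hypothetical; a real system would call an actual language model
# and collect comments from a human reviewer.

def generate_code(requirements, feedback):
    # Stand-in for the model (step 3): pretend the code only passes
    # once it has received at least one round of review comments.
    return {"code": "...", "passes_tests": bool(feedback)}

def review(candidate):
    # Stand-in for the human "code review" (steps 4-5): return
    # comments if the candidate is not yet acceptable.
    if candidate["passes_tests"]:
        return []  # accept: no comments
    return ["the edge case for empty input is handled incorrectly"]

def develop(requirements, max_iterations=5):
    feedback = []
    for _ in range(max_iterations):       # iterate (steps 3-7)
        candidate = generate_code(requirements, feedback)
        comments = review(candidate)
        if not comments:                  # step 7: accept the changes
            return candidate
        feedback.extend(comments)         # step 6: apply the comments
    # Escape hatch: the model needed more hand-holding than allowed.
    raise RuntimeError("opt out and code it yourself")

result = develop("reverse a string, preserving whitespace")
```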

I think there is merit to this: we reverse the equation, and instead of having the AI assist us, we assist the AI. We are the reviewers and also the QA, guiding and verifying along the way. Compare this with authoring the code end-to-end; my claim is that the trade-off pays off once the model is sophisticated enough.
And when it doesn't, that's what escape hatches are for: opt out and code it yourself. I will agree that for some cases this might be overkill, but as a high-fidelity, integrated, automated authoring tool, I believe the ROI is high enough.

So, my friends, perhaps my elevator pitch to you (and a long elevator ride, at that) did not convince you. That's okay! It's not feasible now anyway.
However, just you wait. In a few years - heck, it might just be crazy enough to work.


Like this post? Have any comments? Tweet at me!