July 2, 2024

Computers aren't supposed to be able to do that

Last week I showed a friend a table I’d made with Claude, one of the competitors to ChatGPT, and - slightly to my surprise - it blew his mind. I’d asked which teams in Euro 2024 had made the semi-finals of a major tournament in the last 16 years. Along the way, I had asked Claude to check one of its outputs for accuracy, which it did, making a correction as a result. This sort of “check your work” step is now second nature to me when carrying out factual tasks with AI tools, much as “have you checked you haven’t made any silly mistakes?” was my Dad’s favourite refrain when I told him I had finished my homework. But asking a computer to “have another go” struck my friend as mad: two decades of computer usage had taught him that if you put the same thing in, you get the same thing back out.

“Computers aren’t supposed to be able to do that” 

And, of course, 18 months ago he would have been right. More tellingly, I’d grown so accustomed to this way of working that it had stopped being remarkable to me that computers now can.

Einstein famously said that insanity is doing the same thing over and over and expecting a different result. When working with chatbots, that’s no longer true.

What I thought would be an interesting conversation about the group stage of Euro 2024 (the table proved categorically that Group B was the group of death) ended up being a far more interesting discussion about the new capabilities that chatbot applications have developed over the past year. Like many professionals, my friend had tried ChatGPT a year ago when the first wave of hype arrived, decided it wasn’t much use to him, and never saw fit to try again. Yet over the last 12 months there have been big advances in the techniques for making models do what you want, the tooling they can use, and the models themselves, especially at the free tier.

Most people don’t have the time (or interest) to closely follow the regular beat of product and process improvements coming out of the AI companies. However, paying no attention risks leaving you blind to what computers - or, more accurately, your employees using them - can now do. Below is a quick round-up of the major changes which might have passed you by if you haven’t checked in with ChatGPT since its launch last year. The changes are even more drastic than England’s newfound ability to reach major tournament semi-finals.

Group B was by far the hardest, which is what I was trying to prove before we got distracted by the AI conversation

New workflows have become second nature 

Learning to ask for what you want in a way that gets the desired response is something that doesn’t necessarily come naturally to people, much as managing junior employees for the first time can take some learning. Prompt templates have sought to codify best practices, but writing one really clever set of instructions might be less helpful than having a conversation with clear steps and improvements. LLMs have no feelings and respond really well to being told to do the same thing again, but with specific changes. A few tips: 

  • Asking for error checking, self-critique and suggested revisions should be a standard step in any workflow. The following step should then ask for the output to be reworked taking these into account (if you agree with the suggestions). 
  • Give instructions directly and simply, and expect them to be followed. More like giving commands to a dog than asking a new team member to do some work. 
  • For complicated tasks, you’ll have more success breaking them down into steps yourself and asking for these to be carried out one by one. For example, for the table above I asked for: 
    • All of the semi finalists in major tournaments since 2008 as a list 
    • This information to be put into a table 
    • The groups for Euro 2024 to be written as a table 
    • These two tables to be combined, to show the number of semi-finalists in each group 
    • The table headings to be adjusted and reformatted

With error checking at each stage. 
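The step-by-step workflow above can be sketched as a scripted chat transcript: each task step is followed by an error-checking prompt before moving on to the next. This is a hypothetical illustration - the prompt wording and the helper function are my own, not any specific chatbot's API.

```python
# Hypothetical sketch: break a task into steps, and follow each step with an
# error-checking prompt, sent in order within one ongoing chat session so the
# model keeps the earlier context.

STEPS = [
    "List all the semi-finalists in major tournaments since 2008.",
    "Put that list into a table.",
    "Write the groups for Euro 2024 as a table.",
    "Combine the two tables to show the number of semi-finalists in each group.",
    "Adjust and reformat the table headings.",
]

CHECK = "Check your last answer for errors and correct any you find."

def build_conversation(steps, check=CHECK):
    """Interleave each task step with an error-checking follow-up prompt."""
    prompts = []
    for step in steps:
        prompts.append(step)   # the task step itself
        prompts.append(check)  # "have another go" before the next step
    return prompts

prompts = build_conversation(STEPS)
```

Each prompt would then be sent as the next message in the same conversation; the point of the pattern is that the check happens at every stage, not once at the end.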

The best models are free and much better than the free version of ChatGPT you likely tried 

In May 2024 OpenAI made a version of their flagship model, GPT-4, available to everyone for free in ChatGPT. Previously, the model underpinning the free service was an earlier, less powerful version, whilst the best model required a paid subscription. The new model is worlds better at following instructions, avoids reasoning and factual errors more often (though still not perfectly) and is able to handle longer, more complex inputs and outputs. 

As if this wasn’t enough, on 20th June Anthropic upgraded their chatbot, Claude, to their latest model and made it free for all users, up to a certain number of queries per day. Reviewers have Claude as the clear winner for daily conversation and productivity tasks, despite it having a fraction of the user numbers. “To ChatGPT” might have become a verb, in the way “to Google” did, but unlike with search engines, that doesn’t mean it’s the best option out there. It’s well worth trying this alternative before making your mind up about where these tools can be useful for you. 

Chatbots aren’t just information retrieval or beefed up Google search 

Many users stick to a relatively limited range of conversations with chatbots: information retrieval, writing assistance, homework help and technical support. When ChatGPT was first released that made sense, as it could only access information included in its training up to a cut-off date, and even that was only available as written output. Over the past year, OpenAI, Anthropic and Google have added much more powerful tools which extend the capability of these chatbots and increase the range of tasks you can carry out. 

  • Internet connections mean you can ask for research to be carried out and new information to be added. 
  • Inbuilt code editing & execution means you can ask for data analysis, charts and scripts that you can copy and run. 
  • These inbuilt analysis tools also let you run multi-step processes, like manipulating data before presenting the outputs as a table (or something more complicated). 
  • Multimodality (a way of saying that models can use more than just text) means you can upload images or PDFs, with a mixture of text and images, and expect the model to understand what you gave it. You can produce images (either pictures or graphs) as downloadable outputs too. 

As commentator Ethan Mollick is fond of saying, “Today’s AI is the worst AI you will ever use”. If your reaction to AI tools is that they aren’t useful for you, then I’d encourage you to check what that impression is based on. Is your information up to date? Or have you, like the models, got a knowledge cut-off that is several months old? 

We aren’t used to keeping up with technology that is changing meaningfully in real time. Not since the early generations of the iPhone have technology releases offered consumers and professionals the chance to do something with a computer that they couldn’t imagine doing before. 

In the coming year, the perception of change will continue to be just as fast, as products which are already developed start to roll out to users. Desktop versions of ChatGPT will have access to local files, iPhone integrations will provide “smart” summaries of nearly everything, and Copilot tools in Word and Google Docs will proactively suggest improvements, even without being asked. 

If your reaction to AI tools is “I don’t see any way they could be useful for me”, that’s possibly true, but it is worth checking how confident you are that your methods and impressions are up to date, and whether they’ll stand the test of time. If the changes described above are new or surprising, I’d encourage you to spend a little time each month experimenting with these tools as they develop, and to find a low-time, high-information way of following the changes that matter to you.

Our newsletter does this for organisational leaders - you can subscribe here - or there are lots of good training courses online. Please reach out if you’d like to know more. 

Error checking goes both ways
