Paradigm Junction

Lots of businesses we speak to want to harness the power of generative AI, but integrating Large Language Models (like the one ChatGPT is based on) into their workflows presents at least as many problems as solutions.

Where does your data go?

OpenAI states that it “will not use data submitted by customers via our API to train or improve our models, unless you explicitly decide to share your data with us for this purpose”. Notably absent from this statement, however, is data entered into the ChatGPT interface, which is believed to be retained for 30 days.

For most businesses, this level of assurance is unlikely to be sufficient to justify handing over commercially valuable or proprietary data: the risk of it being shared further is too high. Where customer data is concerned, sharing it with OpenAI may breach a business’s own safeguarding obligations. For particularly sensitive data, such as recorded voice, or financial or health records, sharing it via ChatGPT would represent a serious confidentiality breach.

A key concern for all managers we have spoken to is understanding what their teams are currently doing and how to set policies to determine what can and can’t be put into ChatGPT.
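
One way such a policy could be enforced in practice is to screen prompts before they leave the business. The sketch below is illustrative only: the `PATTERNS` table and the `screen_prompt` function are hypothetical names, and the two regular expressions are deliberately simple stand-ins for whatever categories of data your own policy covers.

```python
import re

# Hypothetical patterns for illustration; a real policy would define its own categories.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens
    "card_number": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}

def screen_prompt(text):
    """Redact matches of each pattern and report which categories were found."""
    findings = []
    for name, pattern in PATTERNS.items():
        if pattern.search(text):
            findings.append(name)
            text = pattern.sub(f"[{name.upper()} REDACTED]", text)
    return text, findings

redacted, findings = screen_prompt(
    "Refund jane.doe@example.com on card 4111 1111 1111 1111 please"
)
print(redacted)
print(findings)
```

A filter like this will miss plenty (names, addresses, free-text confidential details), so it is better thought of as a safety net alongside a written policy than as a replacement for one.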

That said, GenAI systems, including ChatGPT, offer such profound productivity improvements, for everything from customer support to interrogating internal policies and historical decisions, that companies are keen to ask: how can we have a version of ChatGPT that works for us, without exposing information we shouldn’t?

Why might you build a proprietary Large Language Model yourself?

Whilst it still requires the efforts of a contractor with ML engineering experience, building or adapting a model in-house offers a number of key benefits:

Keep your data private

Whether the data needs to be retained for legal, trust or purely commercial reasons, a bespoke language model can give you assurance that any prompts you enter stay firmly within your business. This is crucial if you are building LLM-powered applications to respond to clients, and it would give managers confidence that employees can use GenAI tools for productivity gains without risking a data leak.

Make it learn about your business

GPT-4 has impressive general intelligence, but knows nothing of the specifics of your work (unless these details are readily found on Wikipedia or Stack Overflow!). It is like a highly trained graduate turning up for their first day of work - every single day. This dramatically reduces its utility compared to a language model that understands “how we do things around here”. Whether this is knowledge of the org chart to correctly direct customer queries, or an appreciation that “a project like this was tried before” when writing a strategy presentation, local, situational knowledge is key to extracting competitive advantage from these tools.

Spend less - significantly less

For the first time in many years, running software has appreciable marginal costs. These are not noticeable for casual ChatGPT use, but as soon as you build an application on top of the OpenAI API you will start receiving bills: every token processed is charged and appears on the invoice at the end of the month. If you are automating a high-volume service, such as an AI customer service agent (or an assistant to a human agent), these costs will be significant.
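
To get a feel for how quickly this adds up, a rough estimate can be sketched as follows. Note that the API bills for the tokens in the prompt as well as the tokens generated. The function name `monthly_api_cost`, the workload figures and the per-1,000-token prices are all placeholders for illustration, not quoted prices; substitute the current rates for whichever model you use.

```python
def monthly_api_cost(requests_per_day, input_tokens, output_tokens,
                     price_in_per_1k, price_out_per_1k, days=30):
    """Estimate monthly spend for an API billed per 1,000 tokens."""
    per_request = ((input_tokens / 1000) * price_in_per_1k
                   + (output_tokens / 1000) * price_out_per_1k)
    return requests_per_day * days * per_request

# Hypothetical workload: 5,000 customer queries a day, each with a
# 1,500-token prompt (instructions plus context) and a 500-token reply.
# The prices below are placeholders, not real quotes.
cost = monthly_api_cost(
    requests_per_day=5000,
    input_tokens=1500,
    output_tokens=500,
    price_in_per_1k=0.03,
    price_out_per_1k=0.06,
)
print(f"Estimated monthly cost: ${cost:,.2f}")
```

Under these assumed numbers each request costs 7.5 cents, which comes to roughly $11,250 a month. The point is less the exact figure than the fact that cost scales linearly with volume.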

This arises because each response is generated by running inference calculations across the model’s parameters: 175 billion in the case of GPT-3, while OpenAI has not published a figure for GPT-4. Models this big are necessary for flexible, general AI assistants, but for specific tasks within a business a much smaller model can give just as good results at a fraction of the computing cost.
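
The link between model size and computing cost can be made concrete with a common rule of thumb: a dense transformer performs roughly two floating-point operations per parameter for every token it generates. The sketch below applies that approximation; the 7-billion-parameter "small" model is a hypothetical comparison point, and the estimate ignores memory bandwidth, batching and hardware differences.

```python
def forward_flops_per_token(n_params):
    """Approximate inference compute per token for a dense transformer.

    Rule of thumb: about 2 floating-point operations per parameter per token.
    """
    return 2 * n_params

large = forward_flops_per_token(175e9)  # GPT-3 scale
small = forward_flops_per_token(7e9)    # hypothetical smaller, task-specific model

ratio = large / small
print(f"The larger model needs about {ratio:.0f}x the compute per token")
```

Because the estimate is linear in parameter count, the ratio is simply the ratio of the two model sizes.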

Fortunately, with the technical advances of the past few months, building a solution specifically for you is now within reach of many businesses. There is a range of options, depending on your needs and budget, including:

Downloading a compressed LLM (e.g. GPT4All) and running it locally - fully offline (or “on-prem”)
Fine-tuning an existing open-source model on your own business context and data
Training a new model from scratch

We will explore the pros, cons and costs of each of these approaches in the next article in this series.

Meanwhile, if you’d like to discuss how any of these topics affect your business specifically, get in touch with james@paradigmjunction.com.

How to get the benefits of ChatGPT without sending your commercial data to OpenAI