BusinessAI: utility comes from the predictability of generating valid results, so select the right model

While I first defined artificial intelligence as black boxes, inside them there are tens of thousands of lines of code, which we do not have time (in human terms) to read.

Utility comes from the predictability of a good result. First, we need good data to prepare the system; then we fine-tune and train until the output we obtain is evaluated as successful. We also have to follow a structured methodology to choose the right AI solution, train it for our purposes, and evaluate the results. These more specific tools are “models”.

Stages of work, in increasing order of complexity, and evaluation of the models:

Prompt engineering

When the resources are not there to tailor your own solution, you can still take advantage of the rapid advances in the field by using existing Large Language Models. In any case, this is a good first approach to understand what is possible.

This is about learning how to ask, giving as much context as needed to get the maximum value from the answers. You can also give the model a limited amount of your existing data and work from it.
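A minimal sketch of what "giving context" looks like in practice: a helper that packs a role, background data, and the question into one prompt, which you can then send to any existing LLM. The role, questions, and figures are hypothetical placeholders, not real data.

```python
# Minimal sketch: packing context and instructions into a single prompt.
# All names and numbers below are made-up placeholders.

def build_prompt(question: str, context_snippets: list[str], role: str) -> str:
    """Assemble a context-rich prompt: role, background data, then the question."""
    context = "\n".join(f"- {snippet}" for snippet in context_snippets)
    return (
        f"You are {role}.\n"
        f"Use only the background information below when answering.\n\n"
        f"Background:\n{context}\n\n"
        f"Question: {question}\n"
        f"If the background is insufficient, say so instead of guessing."
    )

prompt = build_prompt(
    question="Which product line grew fastest last quarter?",
    context_snippets=[
        "Q3 revenue, product A: 1.2M EUR (up 8%)",
        "Q3 revenue, product B: 0.9M EUR (up 21%)",
    ],
    role="a financial analyst for a retail company",
)
print(prompt)  # Send this string to the existing LLM of your choice.
```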

Retrieval Augmented Generation

Here we are moving from black boxes to a more tangible, proprietary setup, narrowing the scope and increasing the predictability of results. The process is about supplying the model with actionable data retrieved from your own sources and obtaining a more precise answer. To me it resembles a question-and-answer (Q&A) AI, e.g. chatbots.
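A minimal sketch of the retrieval step, under simplifying assumptions: it scores documents by naive word overlap, while a real system would use embeddings and a vector store. The knowledge base entries are invented for illustration.

```python
# Minimal RAG sketch: retrieve the most relevant snippets from your own data,
# then insert them into the prompt before generation.

def score(query: str, document: str) -> int:
    """Count shared words between the query and a document (toy relevance score)."""
    return len(set(query.lower().split()) & set(document.lower().split()))

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents that best match the query."""
    return sorted(documents, key=lambda d: score(query, d), reverse=True)[:k]

knowledge_base = [
    "Refunds are processed within 14 days of the return request.",
    "Shipping to EU countries takes 3 to 5 business days.",
    "Gift cards cannot be exchanged for cash.",
]

query = "How long does a refund take?"
context = retrieve(query, knowledge_base)
prompt = "Answer using only this context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"
print(prompt)  # Pass the augmented prompt to the generation model.
```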

Fine-Tuning

You take a black box, a model that seems adequate for your end goal, and add a bit of your specific data to get a more accurate result. This black box has already been trained with loads of data. Choosing the right layers of the neural network architecture to target during adaptation can determine the quality of the output from the fine-tuned model.
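A minimal PyTorch sketch of that idea, assuming a toy network standing in for a real pretrained model: the pretrained layers are frozen and only a small task head is adapted on your own (placeholder) data.

```python
# Minimal fine-tuning sketch: freeze the pretrained body, train only a new head.
import torch
import torch.nn as nn

pretrained_body = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32))
task_head = nn.Linear(32, 2)  # the only part adapted to the specific task
model = nn.Sequential(pretrained_body, task_head)

for param in pretrained_body.parameters():
    param.requires_grad = False  # choosing which layers to touch drives output quality

optimizer = torch.optim.Adam(task_head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(64, 16)         # placeholder features from your domain
y = torch.randint(0, 2, (64,))  # placeholder labels

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```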

Pretraining

You have a model architecture with a set of rules, but it has not yet been given the data inputs that will guide future answers. Those inputs are only yours, and you need a lot of data and compute resources for this exercise. This option is for those who want to work only on their own data: to avoid biases from others, for data protection, or because they own unique data sets not available on the market.
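To make the idea concrete, here is a toy sketch of what pretraining means at its core: next-token prediction over your own corpus only. The tiny character-level model and the one-line corpus are placeholders; real pretraining uses far larger architectures and data.

```python
# Toy pretraining sketch: next-token prediction on a proprietary corpus.
import torch
import torch.nn as nn

corpus = "your proprietary text goes here, and only here"  # placeholder data
vocab = sorted(set(corpus))
stoi = {ch: i for i, ch in enumerate(vocab)}
data = torch.tensor([stoi[ch] for ch in corpus])

class TinyLM(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.out(h)

model = TinyLM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

inputs, targets = data[:-1].unsqueeze(0), data[1:].unsqueeze(0)  # predict next char
for step in range(200):
    optimizer.zero_grad()
    logits = model(inputs)
    loss = loss_fn(logits.reshape(-1, len(vocab)), targets.reshape(-1))
    loss.backward()
    optimizer.step()
```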

Evaluate the applications you build

Monitor the outputs and check that they remain relevant, as they might lose accuracy over time. Also watch for sensitivity to prompt variations: the model might perform well overall while its results deviate when the wording or order of the prompt changes, i.e. weaker comprehension.

You can use existing Large Language Models to judge the results of your fine-tuning and training. A recommendation is to use small grading scales, e.g. from 1 to 4, as they are far easier to define and understand than larger scales like 1 to 10 or 1 to 100.
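A minimal sketch of that LLM-as-judge pattern with a 1-to-4 scale. The judge call is a placeholder function you wire to whichever LLM client you already use; the rubric wording and the example question are illustrative, not prescribed.

```python
# Minimal evaluation sketch: ask an existing LLM to grade answers on a 1-4 scale.

RUBRIC = (
    "Grade the answer to the question on a scale of 1 to 4:\n"
    "1 = wrong or irrelevant, 2 = partially correct, "
    "3 = correct but incomplete, 4 = correct and complete.\n"
    "Reply with the number only.\n\n"
    "Question: {question}\nAnswer: {answer}\nGrade:"
)

def judge(question: str, answer: str, call_llm) -> int:
    """Send the rubric to an existing LLM; call_llm is your own client function."""
    reply = call_llm(RUBRIC.format(question=question, answer=answer))
    return int(reply.strip()[0])

# Example with a stub judge so the sketch runs without any API key.
fake_llm = lambda prompt: "3"
grade = judge("What is the refund window?", "14 days after the return request.", fake_llm)
print(grade)  # 3
```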

Role of humans in AI modelling and training

Training models is very expensive in terms of data processing and computation, so you cannot just try all of them with all your data. You have to choose carefully in advance: humans are needed at this stage to assess and identify the right data sets and models, and to evaluate results, in order to spot a viable combination that produces the best outputs, as well as to draw on previously published research.

The eye of a well-trained engineer can save a lot of expenditure and reduce investment by choosing the right model upfront!

Learning resource: DataBricks Big Book of GenAI
