Twelve basic concepts for understanding Generative Artificial Intelligence
In recent years, few technologies have had a greater impact on our lives than Generative Artificial Intelligence. Tools such as ChatGPT or Google Gemini have "slipped" into the daily lives of millions of people around the world, revolutionising the way they work, study, communicate or search for information on the Internet.
From assistants that make it easier to write an email, to tools that speed up code generation, or help in decision-making, Generative AI is changing our relationship with technology, perhaps forever.
In this context, however, it is important not to forget that this intelligence is neither "magic" nor really "intelligent". It is a technology built on technical concepts that must be understood in order to use it critically. In this article, we have compiled twelve essential terms that will allow us to take a different look the next time we ask ChatGPT a question.
LLM (Large Language Model)
An LLM (Large Language Model) is an Artificial Intelligence model designed to process and generate natural language. It is trained on huge amounts of data (books, articles, websites, etc.) to learn linguistic patterns, meanings and language structures.
Thanks to this training, an LLM is able to answer questions, write texts, translate languages, summarise content or even hold coherent conversations.
These models are often based on large neural network architectures (typically transformers) that analyse the relationships between words by considering the full context of a sentence or paragraph at once. LLMs are the basis of many of today's generative AI tools, such as ChatGPT, Google Gemini or Claude.
Importantly, LLMs do not think or understand like a human being. Their responses do not come from conscious intention, but from a statistical calculation of which word or phrase is most likely to follow another based on context.
Prompt
A prompt is the instruction, question or text provided to a language model with the goal of eliciting a response. It is, in essence, the starting point of the conversation or task you want the model to perform. It can be as simple as one word ("summary") or as complex as a paragraph describing a scenario, a role, a response style and a desired format.
The quality and clarity of the prompt directly influences the quality of the response generated. Therefore, knowing how to write good prompts is a key skill in the use of generative AI. This practice is known as prompt engineering and consists of designing the optimal inputs to obtain specific and useful results from the model.
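A common prompt-engineering practice is to assemble the prompt from explicit components: a role, a task, a style and a desired output format. The following sketch illustrates the idea; the function and parameter names are invented for this example, not part of any API.

```python
def build_prompt(task, role, style, output_format):
    """Assemble a structured prompt from its typical components.

    All parameter names here are illustrative; there is no standard schema."""
    return (
        f"You are {role}.\n"
        f"Task: {task}\n"
        f"Style: {style}\n"
        f"Output format: {output_format}"
    )

prompt = build_prompt(
    task="Summarise the attached report in three bullet points",
    role="an experienced financial analyst",
    style="concise and neutral",
    output_format="a Markdown bullet list",
)
print(prompt)
```

The same task phrased as a bare "summarise this" would leave role, tone and format to chance; spelling them out is what turns a vague request into a reproducible one.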
Parameter
In an AI model, a parameter is one of the numerical values that make up its neural network; collectively, the parameters encode the patterns the model has identified in the data, such as the meaning of a word, the relationship between sentences or the tone of a conversation.
There can be billions, or even trillions, of parameters in an LLM. The more parameters a model has, the greater its ability to capture nuances in language and generate more accurate responses. GPT-4, for a long time the most popular model behind ChatGPT, is estimated to have approximately 1.8 trillion parameters, while Llama 4, developed by Meta, reportedly reaches 2 trillion.
However, more parameters also means more computational resources and training data.
Token
A token is the minimum unit of text that the model uses to process and generate language. The curious thing is that a token does not always correspond to a complete word: it can be a syllable, a word root, a letter or even a punctuation mark, depending on the language and the model's encoding system.
LLMs do not read and write text directly like humans, but convert text into sequences of tokens in order to operate on them, which in turn determines how we can interact with the model. For example, the Spanish sentence "Los estudiantes de UDIT diseñan proyectos increíbles" ("UDIT students design amazing projects") could be tokenised as: ["Los", " estudiantes", " de", " U", "DIT", " dise", "ñan", " proyectos", " increíbles", "."].
It should be noted in this regard that the usage or capacity limits of each model are measured in tokens (e.g., 4,000 or 100,000 maximum tokens per input) and that each token takes up memory and determines both response time and computational cost.
Companies using the APIs of these large models typically pay a fee based on both the number of tokens they send in their request (input) and the number of tokens they receive in response (output).
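Since billing is per token, estimating the cost of a call is simple arithmetic over the input and output token counts. The prices below are hypothetical placeholders (in USD per million tokens); each provider publishes its own rates per model.

```python
def api_cost(input_tokens, output_tokens,
             price_in_per_million=3.00, price_out_per_million=15.00):
    """Estimate the cost of one API call from its token counts.

    Default prices are invented for illustration; output tokens are
    typically billed at a higher rate than input tokens."""
    return (input_tokens / 1_000_000 * price_in_per_million
            + output_tokens / 1_000_000 * price_out_per_million)

# A request with a 2,000-token prompt and an 800-token answer:
print(f"${api_cost(2_000, 800):.4f}")  # → $0.0180
```

This is also why long conversations get more expensive over time: the full history is usually resent as input tokens with every new message.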
Hallucination
In the field of Generative AI, a hallucination refers to an answer that appears coherent and credible but is in fact incorrect, made up or unsupported by real data.
LLMs do not fact-check: they simply predict the next most likely word based on context. This means that if they do not have accurate information or if they are asked something ambiguous, they may generate an answer that "sounds good" but is not true.
Despite efforts to limit their impact, hallucinations remain one of the main challenges facing the professional application of generative AI, especially in contexts such as medicine, law or education, where veracity is critical.
Bias
Bias refers to the tendency of an AI model to reflect (and sometimes amplify) inequalities, stereotypes or imbalances present in the data it has been trained on.
These biases can arise unintentionally if the training data contains patterns of discrimination, exclusionary language or disproportionate representation of certain groups or dominant ideologies. Identifying and combating these biases is one of the biggest challenges for the development of responsible AI.
SML (Small Language Model)
An SML or Small Language Model is a natural language model similar to an LLM, but with a much smaller architecture in terms of size and number of parameters.
It is designed to offer text comprehension and generation capabilities with lower resource consumption, and it is easier to train or fine-tune for specific tasks or for the needs of a particular sector or company.
Its advantages include the fact that it can be run locally (on personal devices or on-premises servers) without relying on large cloud infrastructures, making it better suited to safeguarding data privacy.
Agent
An agent is a language model-based system that not only generates text, but can also make decisions and execute actions to accomplish a specific task.
Unlike a chatbot that responds in isolation, an agent has defined goals and can plan the necessary steps to achieve them, interacting with other tools, services or external information sources. This allows it, for example, to search for data, execute commands or keep track of the actions performed.
Thanks to this capacity for autonomous action, agents are increasingly being used in tasks such as customer service, technical assistance, agenda management, data analysis, etc. They integrate language models with programming logic, access to APIs and, in some cases, contextual memory.
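The loop described above (plan a step, call a tool, feed the observation back, repeat until the goal is met) can be sketched in a few lines. Everything here is mocked for illustration: the "model" is a hand-written decision function and the tool is a stand-in for a real API.

```python
def search_weather(city):
    # Stand-in for a real external API call.
    return {"madrid": "sunny, 24°C"}.get(city.lower(), "unknown")

TOOLS = {"search_weather": search_weather}

def mock_model_decide(goal, observations):
    """Pretend LLM: plans one tool call, then produces a final answer."""
    if not observations:
        return {"action": "search_weather", "argument": "Madrid"}
    return {"action": "final_answer",
            "argument": f"The weather in Madrid is {observations[-1]}."}

def run_agent(goal):
    observations = []
    for _ in range(5):  # cap the number of steps to avoid infinite loops
        step = mock_model_decide(goal, observations)
        if step["action"] == "final_answer":
            return step["argument"]
        result = TOOLS[step["action"]](step["argument"])
        observations.append(result)  # feed the observation back to the model

print(run_agent("What is the weather in Madrid?"))
```

Real agent frameworks replace the mock with an actual LLM call and a larger tool registry, but the control flow (decide, act, observe, repeat) is the same.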
Training
The process by which the model learns to generate coherent and contextual language from large volumes of data. During this phase, the model adjusts billions of internal parameters to recognise patterns, word relationships and grammatical structures in order to accurately predict which word (or token) should appear next in a given sequence.
This process is carried out using powerful infrastructures (often employing thousands of GPUs) and can take weeks or months, depending on the size of the model and the data available to "feed" it.
Throughout the training, the system does not memorise exact data, but extracts statistical representations of the language. There are two key stages in this process: pre-training (where the model is trained in a general way with diverse data) and fine-tuning, where the model is refined with specific data or on specific tasks to improve its performance in specific contexts.
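At its core, the training objective is next-token prediction: the model assigns a probability to each candidate token, and the loss penalises it for assigning a low probability to the token that actually came next. A toy version of one such step, with an invented three-word vocabulary:

```python
import math

# The model's (invented) predicted probabilities for the next token.
probs = {"cat": 0.6, "dog": 0.3, "car": 0.1}
actual_next_token = "dog"

# Cross-entropy loss for this single prediction: the lower the probability
# assigned to the true token, the higher the loss.
loss = -math.log(probs[actual_next_token])
print(f"loss = {loss:.3f}")
```

Training adjusts the parameters, step by step over billions of such predictions, so that this loss decreases on average.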
Inference
Inference is the process by which a trained model generates an answer or makes a prediction from an input provided by the user. In the case of a language model, inference occurs when, after you type a question, the model processes the input tokens and generates, token by token, the most likely sequence of text as an answer.
Unlike training, which is costly and time-consuming, inference occurs in real time and is what the user experiences when interacting with an AI. This process can be done in the cloud, on specialised servers, or locally for smaller models.
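Token-by-token generation can be illustrated with a greedy decoding loop: at each step, look up the probabilities for the next token and append the most likely one. The "model" here is just a hand-written probability table, not a real LLM, but large-scale inference follows the same loop.

```python
# Invented next-token probabilities standing in for a trained model.
NEXT_TOKEN_PROBS = {
    "The": {"cat": 0.7, "dog": 0.3},
    "cat": {"sat": 0.6, "ran": 0.4},
    "sat": {".": 0.9, "down": 0.1},
}

def generate(start, max_tokens=5):
    tokens = [start]
    for _ in range(max_tokens):
        options = NEXT_TOKEN_PROBS.get(tokens[-1])
        if options is None:  # no known continuation: stop generating
            break
        tokens.append(max(options, key=options.get))  # greedy choice
    return " ".join(tokens)

print(generate("The"))  # → The cat sat .
```

Real systems usually sample from these probabilities rather than always taking the maximum, which is why the same prompt can produce different answers.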
Distillation
Distillation is a technique that allows knowledge to be transferred from a large, complex model to a smaller, faster and more efficient model without losing too much accuracy.
This process works by training the smaller model (called the student) to mimic the outputs of the larger model (the teacher), rather than training it directly on the original data.
In this way, the distilled model learns to replicate the behaviour of the original, but with less resource usage, making it more suitable for deployment in mobile devices, hardware-constrained environments or real-time applications.
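A key ingredient of distillation is training the student on the teacher's "soft targets": probability distributions softened with a temperature above 1, which reveal how the teacher ranks even the wrong answers. The logits below are invented numbers for a three-token vocabulary.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores into probabilities; higher temperature flattens them."""
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Teacher logits for three candidate tokens (invented for illustration).
teacher_logits = [4.0, 2.0, 0.5]

hard = softmax(teacher_logits, temperature=1.0)  # near one-hot: little extra signal
soft = softmax(teacher_logits, temperature=4.0)  # soft targets: relative rankings survive
print([round(p, 3) for p in hard])
print([round(p, 3) for p in soft])
```

The student is then trained to match the soft distribution, which carries more information per example than the single correct label alone.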
AGI (Artificial General Intelligence)
AGI, or Artificial General Intelligence, is a hypothetical form of Artificial Intelligence capable of performing any cognitive task that a human being can do. Unlike current systems that are designed for specific tasks (e.g. generating text, translating languages or answering questions), an AGI would have general reasoning capabilities, similar (or perhaps superior) to those of a human being.
An AGI would not only process information, but would be able to understand, learn from experience, transfer knowledge between different areas of expertise and act autonomously in a variety of environments.
Although it does not yet exist (and may never materialise), it is the horizon towards which many AI research efforts are aiming.