Glossary of Selected AI Terms

Sandy Sanbar
Jul 7, 2024

Algorithm

This is a set of rules that a machine can follow to learn how to do a task. It is a step-by-step process for solving a problem. Think of it as a recipe, where each step is clearly laid out to achieve a desired outcome. In the context of artificial intelligence, algorithms drive the logic behind software applications. This enables them to process data, make decisions, and produce results in a consistent manner.

Artificial intelligence (AI)

AI refers to the general concept of machines simulating or mimicking human intelligence: machines that are programmed to think and perform tasks like humans. It encompasses a broad range of technologies and techniques that enable machines to sense, comprehend, act, and learn from experience.
The primary aim of AI is to create systems that can perform tasks that normally require human intelligence, such as recognizing speech, making decisions, interpreting visual input, and understanding language.
Over the years, AI has evolved to include various subfields, ranging from machine learning and deep learning to robotics and natural language processing, all of which have use cases within healthcare.

Autonomous

An AI machine is described as autonomous if it can perform its task or tasks without needing human intervention.

Backward chaining

A method where the AI model starts with the desired output and works in reverse to find data that might support it.
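
As a rough illustration, backward chaining can be sketched in a few lines of Python. The rule base, the facts, and all names here (`RULES`, `FACTS`, `backward_chain`) are invented for this example and are not medical guidance.

```python
# Hypothetical rule base: each conclusion maps to the premises that support it.
RULES = {
    "flu_suspected": ["fever", "cough"],
    "fever": ["temperature_above_38"],
}

# Hypothetical known facts (the data available to the system).
FACTS = {"temperature_above_38", "cough"}


def backward_chain(goal, rules, facts):
    """Return True if `goal` can be supported by `facts` via `rules`."""
    if goal in facts:
        return True
    premises = rules.get(goal)
    if premises is None:
        # Neither a known fact nor the conclusion of any rule.
        return False
    # Work in reverse: the goal holds only if every premise can be supported.
    return all(backward_chain(p, rules, facts) for p in premises)


print(backward_chain("flu_suspected", RULES, FACTS))
```

The search starts at the desired output ("flu_suspected") and only looks at the data needed to support it.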

Bias (in AI)

Bias refers to systematic errors in predictions due to underlying factors in the training data. For example, if the training data contains built-in prejudices or is not representative of the broader population or scenario, the AI model’s output can perpetuate or magnify these biases. Most supervised machine learning models perform better with low bias, because biased assumptions can negatively affect results.

Big data

Datasets that are too large or complex to be used by traditional data processing applications.

Bounding box

Commonly used in image or video tagging, a bounding box is an imaginary box drawn on visual information. The contents of the box are labeled to help an AI model recognize them as a distinct type of object.

Chatbot

A chatbot is a program designed to communicate with people through text or voice commands in a way that mimics human-to-human conversation. In the healthcare sector, AI-driven chatbots could serve important roles such as arranging appointments, providing symptom assessments, and offering medication reminders.

Classification

This is a type of task where the goal is to predict which category or class an input belongs to.

Cognitive computing

This is effectively another way to say artificial intelligence. It’s used by marketing teams at some companies to avoid the science fiction aura that sometimes surrounds AI.

Confusion matrix

This is a table used to evaluate the performance of a classification model. It counts four kinds of outcome:

a. True positive: The number of times the model correctly predicted “positive”

b. True negative: The number of times the model correctly predicted “negative”

c. False positive: The number of times the model incorrectly predicted “positive”

d. False negative: The number of times the model incorrectly predicted “negative”
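
The four counts above can be tallied with a short Python sketch. The function name `confusion_counts` and the six made-up patient labels are assumptions for illustration only.

```python
def confusion_counts(actual, predicted, positive="positive"):
    """Count the four confusion-matrix cells for a two-class model."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for truth, guess in zip(actual, predicted):
        if guess == positive:
            # Predicted positive: correct (TP) or incorrect (FP).
            counts["TP" if truth == positive else "FP"] += 1
        else:
            # Predicted negative: correct (TN) or incorrect (FN).
            counts["TN" if truth != positive else "FN"] += 1
    return counts


# Made-up true labels and model predictions for six patients.
actual = ["positive", "positive", "negative", "negative", "positive", "negative"]
predicted = ["positive", "negative", "negative", "positive", "positive", "negative"]

counts = confusion_counts(actual, predicted)
print(counts)
```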

Potential uses of a confusion matrix in medicine include:

Disease Detection: AI models might be trained to detect the presence or absence of a disease based on medical images, lab results, or other diagnostic data. The confusion matrix can help determine how often the model correctly identifies patients with the disease (True Positives) versus misdiagnosing healthy individuals as having the disease (False Positives), and so on. For example, a JAMA Network Open study [2] used a confusion matrix to help evaluate the ability of AI to distinguish Kawasaki Disease from other causes of fever in children brought to a pediatric emergency department.
Treatment Recommendation: AI might be used to recommend whether a patient should receive a certain treatment or not. In such cases, a confusion matrix can be used to understand how often the AI’s recommendations align with expert opinions or outcomes.
Predictive Modelling: AI can be used to predict patient outcomes, such as whether a patient will be readmitted to a hospital within 30 days. The confusion matrix can show how accurate these predictions are.

Computational learning theory

A field within artificial intelligence that is primarily concerned with creating and analyzing machine learning algorithms.

Corpus

This denotes a large dataset of written or spoken material that can be used to train a machine to perform linguistic tasks.

Data mining

The process of analyzing datasets in order to discover new patterns that might improve the model.

Dataset

A collection of related data points, usually with a uniform order and tags.

Data science

Drawing from statistics, computer science and information science, this interdisciplinary field aims to use a variety of scientific methods, processes and systems to solve problems involving data.

Deep learning

A function of artificial intelligence that imitates the human brain by learning from the way data is structured, rather than from an algorithm that’s programmed to do one specific thing.
Deep learning is a subset of machine learning, inspired by the human brain’s structure and function, specifically, neural networks. It involves algorithms known as artificial neural networks, which can process enormous amounts of data and automatically extract patterns and features from it. These networks are ‘deep’ because they consist of multiple layers of interconnected nodes, allowing them to make sophisticated decisions.
For example, a paper in the Journal of the American College of Cardiology discusses deep learning in cardiology, where it may make it possible to automate echocardiogram analysis and predict disease on a large scale.

Entity annotation

The process of labeling unstructured sentences with information so that a machine can read them. This could involve labeling all people, organizations and locations in a document, for example.
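
One possible way to store such labels is as character spans over the original text. The sentence, the label names, and the helper `span` below are all invented for this sketch; real annotation tools use their own formats.

```python
text = "Dr. Lee joined Northside Clinic in Tulsa."


def span(text, phrase, label):
    """Build one annotation: where the phrase starts and ends, and its label."""
    start = text.index(phrase)
    return {"start": start, "end": start + len(phrase), "label": label}


# Hypothetical annotations for a person, an organization and a location.
annotations = [
    span(text, "Dr. Lee", "PERSON"),
    span(text, "Northside Clinic", "ORGANIZATION"),
    span(text, "Tulsa", "LOCATION"),
]

for a in annotations:
    print(a["label"], "->", text[a["start"]:a["end"]])
```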

Entity extraction

An umbrella term referring to the process of adding structure to data so that a machine can read it. Entity extraction may be done by humans or by a machine learning model.

Forward chaining

A method in which a machine must work from a problem to find a potential solution. By analyzing a range of hypotheses, the AI must determine those that are relevant to the problem.
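
One common way to implement forward chaining is a small rule engine that starts from what is known and keeps applying rules until nothing new can be concluded. The sketch below assumes that approach; the rules and names (`RULES`, `forward_chain`) are hypothetical.

```python
def forward_chain(facts, rules):
    """Fire rules whose premises are all known until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Hypothetical rules: (premises, conclusion). Only relevant rules will fire.
RULES = [
    (["temperature_above_38"], "fever"),
    (["fever", "cough"], "flu_suspected"),
    (["rash"], "allergy_suspected"),
]

derived = forward_chain({"temperature_above_38", "cough"}, RULES)
print(sorted(derived))
```

The rule about "rash" never fires because it is not relevant to the starting facts.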

General AI

AI that could successfully do any intellectual task that can be done by any human being. This is sometimes referred to as strong AI, although they aren’t entirely equivalent terms.

Generalization

This refers to an AI model’s capacity to apply its learning from one set of data to new, unfamiliar data. It measures how well an AI model can predict or make decisions in real-world scenarios that it wasn’t specifically taught to handle.

Hyperparameter

Occasionally used interchangeably with parameter, although the terms have some subtle differences. Hyperparameters are values that affect the way your model learns. They are usually set manually outside the model.

Intent

Commonly used in training data for chatbots and other natural language processing tasks, this is a type of label that defines the purpose or goal of what is said. For example, the intent for the phrase “turn the volume down” could be “decrease volume”.

Label

A part of training data that identifies the desired output for that particular piece of data.

Large language models (LLMs)

These are advanced algorithms designed to process, generate, and understand human language on a vast scale. They are trained on large amounts of text (data) from diverse sources, enabling them to generate coherent sentences, answer questions, and even produce content resembling human-written prose.

Linguistic annotation

Tagging a dataset of sentences with the subject of each sentence, ready for some form of analysis or assessment. Common uses for linguistically annotated data include sentiment analysis and natural language processing.

Machine intelligence

This is an umbrella term for various types of learning algorithms, including machine learning and deep learning.

Machine learning

This subset of AI is particularly focused on developing algorithms that will help machines to learn and change in response to new data, without the help of a human being. Machine learning refers to a computer system using algorithms to learn and improve from experience. The algorithms analyze data to recognize patterns, and use those patterns to make choices and/or predictions. The more data the system processes, the more accurate its outputs become, which can allow it to tackle complex tasks that used to require human judgement.[2]

Machine translation

The translation of text by an algorithm, independent of any human involvement.

Model

A broad term referring to the product of AI training, created by running a machine learning algorithm on training data.

Natural language generation (NLG)

This refers to the process by which a machine turns structured data into text or speech that humans can understand. Essentially, NLG is concerned with what a machine writes or says as the end part of the communication process.

Natural language processing (NLP)

NLP is a subfield of AI that focuses on the interaction between computers and humans through natural language. It is the umbrella term for any machine’s ability to perform conversational tasks, such as recognizing what is said to it, understanding the intended meaning and responding intelligibly. This technology is a key component of large language models like ChatGPT, Claude, and Gemini.

Natural language understanding (NLU)

As a subset of natural language processing, natural language understanding deals with helping machines to recognize the intended meaning of language — taking into account its subtle nuances and any grammatical errors.

Neural network

This is a type of machine learning model inspired by the human brain, composed of interconnected nodes or virtual ‘neurons.’ Also called a neural net, it is a computer system designed to function like the human brain. Although researchers are still working on creating a machine model of the human brain, existing neural networks can perform many tasks involving speech, vision and board game strategy.
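
To make "interconnected nodes" concrete, here is a tiny network of three nodes in Python. The weights are set by hand for illustration rather than learned from data, and the names `step`, `neuron` and `tiny_network` are assumptions of this sketch.

```python
def step(x):
    """A node 'fires' (outputs 1) when its total input is positive."""
    return 1 if x > 0 else 0


def neuron(inputs, weights, bias):
    """One node: weighted sum of inputs plus a bias, passed through step()."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)


def tiny_network(a, b):
    # Hidden layer: two nodes that both read the inputs.
    h1 = neuron([a, b], [1.0, 1.0], -0.5)    # fires if a OR b is 1
    h2 = neuron([a, b], [-1.0, -1.0], 1.5)   # fires unless a AND b are both 1
    # Output layer: one node that reads the hidden nodes.
    return neuron([h1, h2], [1.0, 1.0], -1.5)  # fires if both hidden nodes fire


outputs = {(a, b): tiny_network(a, b) for a in (0, 1) for b in (0, 1)}
print(outputs)
```

With these hand-picked weights the network outputs 1 only when exactly one input is 1, something no single node of this kind can do on its own.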

Overfitting

Overfitting is a symptom of machine learning training in which an algorithm is only able to work on or identify specific examples present in the training data. A working model should be able to use the general trends behind the data to work on new examples.
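
A deliberately extreme sketch in Python shows the difference. The data and both "models" are made up for illustration: one memorizes the training examples, the other learns a single general rule.

```python
# Made-up examples: (measurement, label).
train = [(10, "low"), (30, "low"), (60, "high"), (90, "high")]
test = [(20, "low"), (40, "low"), (70, "high"), (80, "high")]


def accuracy(model, data):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == label for x, label in data) / len(data)


# Overfit model: remembers exact training inputs and nothing else.
memory = dict(train)

def memorizer(x):
    return memory.get(x, "unknown")


# Simpler model: learns one threshold from the general trend in the data.
highest_low = max(x for x, label in train if label == "low")
lowest_high = min(x for x, label in train if label == "high")
threshold = (highest_low + lowest_high) / 2

def threshold_model(x):
    return "high" if x >= threshold else "low"


print(accuracy(memorizer, train), accuracy(memorizer, test))
print(accuracy(threshold_model, train), accuracy(threshold_model, test))
```

Both models are perfect on the training data, but only the threshold model still works on examples it has not seen.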

Parameter

This is a variable inside the model that helps it to make predictions. Parameter values can be estimated from data and are usually not set by the person running the model.
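
The difference between a parameter and a hyperparameter can be seen in a small Python sketch. It assumes a model of the form y = w * x trained by gradient descent, which is one common technique among many; the function name `fit_slope` and the data are invented.

```python
def fit_slope(xs, ys, learning_rate=0.01, epochs=200):
    """Fit y = w * x by gradient descent on the mean squared error.

    learning_rate and epochs are hyperparameters: chosen by the person
    running the training. w is a parameter: estimated from the data.
    """
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


# Made-up data that follows y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = fit_slope(xs, ys)
print(round(w, 3))
```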

Pattern recognition

The distinction between pattern recognition and machine learning is often blurry, but pattern recognition is basically concerned with finding trends and patterns in data.

Predictive analytics

By combining data mining and machine learning, this type of analytics is built to forecast what will happen within a given timeframe based on historical data and trends.

Prompting

This refers to giving an AI system a specific instruction or question. Prompts can be as short or long as need be. But keep in mind that the more specific the prompts are, the more likely it is that the AI’s output will be what you’re looking for.

Python

A popular programming language used for general programming.

Reinforcement learning

This is a method of teaching AI that sets a goal without specific metrics, encouraging the model to test different scenarios rather than find a single answer. Based on human feedback, the model can then manipulate the next scenario to get better results.

Reinforcement learning is a type of machine learning where an AI learns how to behave in an environment by performing actions and receiving rewards and/or punishments. For example, reinforcement learning was used to train an AI to detect skin cancer.
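
The loop of acting, receiving a reward, and adjusting can be sketched as follows. Everything here is a toy assumption: an environment with two actions and fixed rewards (`REWARDS`), and an agent (`run_agent`) that explores on a fixed schedule instead of at random so the run is repeatable.

```python
# Hypothetical environment: each action always returns the same reward.
REWARDS = {"left": 0.2, "right": 1.0}


def run_agent(steps=50, explore_every=5):
    actions = list(REWARDS)
    value = {a: 0.0 for a in actions}  # running average reward per action
    pulls = {a: 0 for a in actions}    # how often each action was tried
    for t in range(steps):
        if t % explore_every == 0:
            # Scheduled exploration: take turns trying each action.
            action = actions[(t // explore_every) % len(actions)]
        else:
            # Exploitation: pick the action with the best estimate so far.
            action = max(actions, key=value.get)
        reward = REWARDS[action]
        pulls[action] += 1
        value[action] += (reward - value[action]) / pulls[action]
    return value, pulls


value, pulls = run_agent()
print(value, pulls)
```

After a few rounds the agent has learned that "right" pays more and chooses it most of the time.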

Semantic annotation

Tagging different search queries or products with the goal of improving the relevance of a search engine.

Sentiment analysis

The process of identifying and categorizing opinions in a piece of text, often with the goal of determining the writer’s attitude towards something.
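
A very simple form of sentiment analysis counts positive and negative words. The word lists and the function name `sentiment` below are made up for this sketch; production systems typically use trained models instead.

```python
# Hypothetical word lists.
POSITIVE = {"good", "great", "helpful", "excellent"}
NEGATIVE = {"bad", "poor", "rude", "slow"}


def sentiment(text):
    """Label text by comparing counts of positive and negative words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("The nurse was helpful and the care was excellent."))
print(sentiment("Slow service and rude staff!"))
```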

Strong AI

This field of research is focused on developing AI that is equal to the human mind when it comes to ability. General AI is a similar term often used interchangeably.

Supervised learning

This is a type of machine learning where structured datasets, with inputs and labels, are used to train and develop an algorithm. The AI learns from a dataset where each example is paired with the correct output.
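
As a minimal sketch, the Python below learns from labeled examples using a nearest-neighbor rule, which is one simple supervised method among many. The temperatures, labels and function name `nearest_neighbor` are invented and are not clinical thresholds.

```python
def nearest_neighbor(examples, x):
    """Predict the label of the labeled example whose input is closest to x."""
    closest_input, closest_label = min(examples, key=lambda pair: abs(pair[0] - x))
    return closest_label


# Made-up labeled dataset: (input, correct output).
labeled_examples = [
    (36.5, "normal"),
    (36.9, "normal"),
    (38.4, "fever"),
    (39.1, "fever"),
]

print(nearest_neighbor(labeled_examples, 37.0))
print(nearest_neighbor(labeled_examples, 38.8))
```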

Test data

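Data held back from training and used to check that a machine learning model is able to perform its assigned task. The model is shown the inputs without their labels, and its outputs are then compared with the expected answers.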

Training data

This can refer to all of the data used during the process of training a machine learning algorithm, or more narrowly to the specific dataset used for training rather than testing. It’s the data used to teach an AI model.

Transfer learning

This method of learning involves spending time teaching a machine to do a related task, then allowing it to return to its original work with improved accuracy. One potential example of this is taking a model that analyzes sentiment in product reviews and asking it to analyze tweets for a week.

Turing test

Named after Alan Turing, famed mathematician, computer scientist and logician, this tests a machine’s ability to pass for a human, particularly in the fields of language and behavior. After being graded by a human, the machine passes if its output is indistinguishable from that of a human participant.

Unsupervised learning

This is a form of training where the algorithm is asked to make inferences from datasets that don’t contain labels. These inferences are what help it to learn.
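
Grouping similar values without any labels is one example. The sketch below assumes a simple two-group clustering routine (a one-dimensional version of the k-means idea); the function name `two_means` and the numbers are made up.

```python
def two_means(values, iterations=10):
    """Split unlabeled numbers into two groups around two moving centers."""
    centers = [min(values), max(values)]  # simple starting guess
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to whichever center is closer.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the average of its group.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Made-up unlabeled data.
values = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]

centers, groups = two_means(values)
print(centers, groups)
```

Nothing told the algorithm which values belong together; the two groups emerge from the data itself.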

Validation data

Structured like training data with an input and labels, this data is used to test a recently trained model against new data and to analyze performance, with a particular focus on checking for overfitting.
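
A labeled dataset is often divided into training, validation and test portions. The sketch below shows one way to do this in Python; the 60/20/20 proportions and the name `split_dataset` are assumptions, and in practice the examples are usually shuffled first.

```python
def split_dataset(examples, train_frac=0.6, validation_frac=0.2):
    """Cut a list of examples into training, validation and test parts."""
    n = len(examples)
    train_end = round(n * train_frac)
    validation_end = train_end + round(n * validation_frac)
    training = examples[:train_end]
    validation = examples[train_end:validation_end]
    testing = examples[validation_end:]  # whatever remains
    return training, validation, testing


# Made-up examples: (input, label).
examples = [(i, i % 2) for i in range(10)]

training, validation, testing = split_dataset(examples)
print(len(training), len(validation), len(testing))
```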

Variance

The amount that the intended function of a machine learning model changes while it’s being trained. Despite being flexible, models with high variance are prone to overfitting and low predictive accuracy because they are reliant on their training data.

Variation

Also called queries or utterances, these work in tandem with intents for natural language processing. The variation is what a person might say to achieve a certain purpose or goal. For example, if the intent is “pay by credit card,” the variation might be “I’d like to pay by card, please.”
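
In training data, intents and variations are often stored together. The Python sketch below uses the two intents mentioned in this glossary plus a few invented variations, and a deliberately naive word-overlap matcher (`guess_intent`) standing in for a trained model.

```python
# Each intent maps to example variations (utterances). Partly made up.
TRAINING_DATA = {
    "pay by credit card": [
        "I'd like to pay by card, please.",
        "Can I use my credit card?",
    ],
    "decrease volume": [
        "Turn the volume down.",
        "Make it quieter.",
    ],
}


def words(text):
    """Lowercased set of words with surrounding punctuation removed."""
    return {w.strip(".,!?").lower() for w in text.split()}


def guess_intent(utterance):
    """Pick the intent whose variations share the most words with the utterance."""
    best_intent, best_overlap = None, 0
    for intent, variations in TRAINING_DATA.items():
        overlap = max(len(words(utterance) & words(v)) for v in variations)
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    return best_intent


print(guess_intent("Please turn the volume down"))
print(guess_intent("Can I pay with my card?"))
```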

Weak AI

Also called narrow AI, this is a model that has a set range of skills and focuses on one particular set of tasks. Most AI currently in use is weak AI, unable to learn or perform tasks outside of its specialist skill set.

References

[1] 50 AI Terms Every Beginner Should Know | TELUS International

[2] Tsai CM, Lin CR, Kuo HC, Cheng FJ, Yu HR, Hung TC, Hung CS, Huang CM, Chu YC, Huang YH. Use of Machine Learning to Differentiate Children With Kawasaki Disease From Other Febrile Children in a Pediatric Emergency Department. JAMA Netw Open. 2023 Apr 3;6(4):e237489.
