Explore the end-to-end journey of AI model development — from data preparation and training to monitoring and continuous improvement.
An AI pipeline automates the end-to-end workflow of an AI project — from data ingestion to training, evaluation, deployment, and monitoring.
Feature engineering involves selecting, creating, or transforming input variables (features) to improve a model’s ability to learn patterns from data.
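For instance, a couple of derived features can be built in a few lines of pandas; the column names below (signup_date, purchases, and so on) are hypothetical, chosen only to illustrate the idea.

```python
# A minimal feature-engineering sketch with pandas (columns are illustrative).
import pandas as pd

df = pd.DataFrame({
    "signup_date": pd.to_datetime(["2023-01-05", "2023-03-20"]),
    "last_login": pd.to_datetime(["2023-06-01", "2023-06-15"]),
    "purchases": [3, 12],
    "revenue": [45.0, 310.0],
})

# Derive new features that expose patterns the raw columns hide.
df["tenure_days"] = (df["last_login"] - df["signup_date"]).dt.days
df["avg_order_value"] = df["revenue"] / df["purchases"]
print(df[["tenure_days", "avg_order_value"]])
```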
Machine learning is a subset of AI where algorithms learn from data to make predictions or decisions without being explicitly programmed. It powers applications like fraud detection, recommendations, and forecasting.
Training data is the labeled dataset used to teach a machine learning model how to make predictions. Its quality and structure directly impact model performance.
Model drift occurs when a model’s performance degrades over time due to changes in the input data or the real-world environment. It requires continuous monitoring.
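One common way to surface drift is a statistical comparison between training-time and live feature distributions. The sketch below uses SciPy's two-sample Kolmogorov-Smirnov test on synthetic data; the 0.01 threshold is an illustrative choice, not a standard.

```python
# A minimal drift check on one numeric feature: compare the training
# distribution to recent production data with a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_values = rng.normal(loc=0.0, scale=1.0, size=1000)  # training-time data
live_values = rng.normal(loc=0.4, scale=1.0, size=1000)   # shifted production data

stat, p_value = ks_2samp(train_values, live_values)
if p_value < 0.01:  # illustrative threshold; tune per feature and volume
    print(f"Possible drift detected (KS statistic={stat:.3f}, p={p_value:.4f})")
```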
Model monitoring tracks the performance, accuracy, and fairness of deployed models — helping teams catch drift, bias, or performance issues in real time.
Model lineage tracks the full lifecycle of a model — from data sources and training steps to deployments and updates. It enables auditability, reproducibility, and trust in model-driven decisions.
A model registry is a centralized system for managing versions of machine learning models, including metadata, approval stages, and deployment status. It ensures traceability, collaboration, and lifecycle control.
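As a rough sketch of the idea (not any particular product's API), a registry records each version of a model together with its metadata and lifecycle stage:

```python
# An in-memory sketch of what a model registry records per version.
# Real systems (e.g. MLflow's registry) persist this centrally.
from dataclasses import dataclass

@dataclass
class ModelVersion:
    name: str
    version: int
    metrics: dict
    stage: str = "staging"  # e.g. staging -> approved -> production

registry: dict[tuple[str, int], ModelVersion] = {}

def register(name: str, metrics: dict) -> ModelVersion:
    version = 1 + max((v for (n, v) in registry if n == name), default=0)
    mv = ModelVersion(name, version, metrics)
    registry[(name, version)] = mv
    return mv

mv = register("churn-model", {"auc": 0.91})
mv.stage = "production"  # promotions and rollbacks are recorded
print(registry[("churn-model", 1)])
```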
Model training is the process where an algorithm learns from historical data by adjusting parameters to minimize prediction error.
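A minimal, runnable example with scikit-learn on synthetic data shows the fit-then-evaluate loop:

```python
# Fit a model on historical data, then measure error on held-out data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)  # parameters adjusted to minimize prediction error
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```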
Model versioning keeps track of iterations and changes to machine learning models — helping teams compare performance and roll back if needed.
Understand how policies, frameworks, and safeguards ensure responsible AI development, transparency, and regulatory compliance.
Accountability in AI ensures that clear roles, responsibilities, and ownership are defined across the AI lifecycle — from data sourcing to model deployment and audit.
AI governance refers to the framework of policies, practices, and regulations that guide the responsible development and use of artificial intelligence. It ensures ethical compliance, data transparency, risk management, and accountability — critical for organizations seeking to scale AI securely and align with evolving regulatory standards.
AI literacy refers to the ability of individuals across an organization to understand, interpret, and effectively engage with artificial intelligence systems. This includes knowing the capabilities and limitations of AI, ethical implications, and how AI fits into broader data governance and business contexts.
AI risk management involves identifying and mitigating risks introduced by machine learning models — including bias, drift, compliance breaches, and reputational harm. It’s essential for safe and scalable AI adoption.
Bias in AI refers to systematic errors that unfairly favor certain groups over others — often introduced through biased training data or flawed assumptions in model design.
AI compliance refers to adhering to evolving regulations (e.g., EU AI Act, GDPR) that govern how AI systems are developed, validated, documented, and monitored — especially when sensitive data or high-risk use cases are involved.
Ethical AI is about aligning AI systems with ethical principles like respect for human rights, dignity, safety, and fairness — going beyond compliance alone.
Explainability is the ability to understand and communicate how an AI system makes decisions — essential for trust, accountability, and debugging complex models.
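One widely used, model-agnostic technique is permutation importance: shuffle one feature at a time and measure how much the model's score drops. A sketch with scikit-learn on synthetic data:

```python
# Permutation importance estimates each feature's contribution to a
# fitted model's predictions (data here is synthetic).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, score in enumerate(result.importances_mean):
    print(f"feature {i}: importance {score:.3f}")
```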
Fairness ensures AI systems do not discriminate or produce unequal outcomes based on protected attributes such as race, gender, or age.
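A simple, commonly used check is demographic parity: compare the rate of positive predictions across groups. The sketch below assumes binary predictions and a single protected attribute with two groups.

```python
# Demographic parity: do the two groups receive positive predictions
# at similar rates? (predictions and groups are made up)
import numpy as np

predictions = np.array([1, 0, 1, 1, 0, 1, 0, 0])
group = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])  # protected attribute

rate_a = predictions[group == "a"].mean()
rate_b = predictions[group == "b"].mean()
print(f"positive-rate gap: {abs(rate_a - rate_b):.2f}")  # 0 means parity
```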
Responsible AI refers to the practice of building and deploying AI systems that are ethical, transparent, inclusive, and aligned with societal values. It spans fairness, bias mitigation, privacy, and accountability.
Transparency in AI involves making the design, intent, data sources, and logic of AI systems visible and understandable to stakeholders — including regulators, users, and developers.
Learn how metadata, schemas, and semantic layers structure and standardize AI inputs — enabling interpretability, consistency, and scale.
AI observability is the ability to monitor, debug, and trace how AI systems behave in real-world environments — across data inputs, model decisions, and user interactions.
Data labeling is the process of assigning metadata or categories to raw data — such as tagging images, classifying text, or identifying entities — to make it usable for training supervised ML models.
Data provenance tracks the full history of a data asset — where it originated, how it was transformed, and who touched it — offering traceability for audits, compliance, and trust.
Data readiness measures how prepared your data is to support AI and analytics — across completeness, structure, quality, documentation, and business meaning.
A knowledge graph represents data as entities and relationships — connecting concepts in a network. It enables semantic search, AI readiness, and deeper business context.
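A toy sketch with networkx illustrates the structure; the entities and relations here are invented for illustration.

```python
# A tiny knowledge graph: entities as nodes, relationships as labeled edges.
import networkx as nx

g = nx.DiGraph()
g.add_edge("Customer", "Order", relation="places")
g.add_edge("Order", "Product", relation="contains")
g.add_edge("Product", "Category", relation="belongs_to")

# Traverse the relationships to recover connected business context.
for src, dst, data in g.edges(data=True):
    print(f"{src} --{data['relation']}--> {dst}")
```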
ML metadata refers to the data that describes machine learning artifacts — including training datasets, model parameters, evaluation metrics, and deployment details. Managing this metadata is key to reproducibility and operational visibility.
Prompt engineering is the practice of designing, testing, and optimizing inputs (prompts) for large language models (LLMs) to get consistent, relevant, and useful outputs.
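In practice this often means maintaining prompt templates with explicit roles, constraints, and variables. The template below is purely illustrative:

```python
# A prompt template with a role, output constraints, and fill-in variables.
TEMPLATE = """You are a support assistant for {product}.
Answer in at most {max_sentences} sentences.
If the answer is not in the context, say "I don't know."

Context: {context}
Question: {question}"""

prompt = TEMPLATE.format(
    product="Acme Analytics",
    max_sentences=3,
    context="Acme Analytics supports CSV and Parquet imports.",
    question="Can I import Parquet files?",
)
print(prompt)
```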
The semantic layer sits between raw data and users — translating technical structures into consistent business terms and metrics. It enables clarity, reuse, and self-service analytics.
Get familiar with the concepts behind large language models and generative AI — including training data, prompt engineering, and AI product design.
A context window defines how much information (tokens) an LLM can “see” at one time during inference. It limits how much data can be used in a single prompt, affecting performance in long documents or multi-turn conversations.
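A rough sketch of budgeting against a context window follows; real tokenizers count tokens precisely, so the characters-per-token heuristic and the window size below are assumptions.

```python
# Check whether a prompt fits the model's context window, leaving room
# for the response. The 4-chars-per-token heuristic is a crude stand-in
# for a real tokenizer.
CONTEXT_WINDOW = 8192      # tokens the model can "see" at once (illustrative)
RESERVED_FOR_OUTPUT = 512  # leave room for the model's response

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # rough heuristic, not a real tokenizer

def fits(prompt: str) -> bool:
    return estimate_tokens(prompt) + RESERVED_FOR_OUTPUT <= CONTEXT_WINDOW

print(fits("Summarize this report: ..."))
```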
An embedding is a numerical representation of a word, sentence, or document that captures its meaning and relationships in a multi-dimensional space. Embeddings allow LLMs to perform similarity search, clustering, and semantic reasoning.
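Similarity between embeddings is typically measured with cosine similarity. The three-dimensional vectors below are made up; real embeddings have hundreds or thousands of dimensions and come from an embedding model.

```python
# Cosine similarity: nearby vectors indicate related meanings.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

dog = np.array([0.9, 0.1, 0.3])
puppy = np.array([0.85, 0.15, 0.35])
invoice = np.array([0.1, 0.9, 0.2])

print(cosine_similarity(dog, puppy))    # high: related meanings
print(cosine_similarity(dog, invoice))  # low: unrelated meanings
```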
Fine-tuning is the process of taking a pre-trained foundation model and retraining it on domain-specific data to improve performance on targeted tasks — balancing general language skills with organizational context.
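A minimal PyTorch sketch of the idea: freeze a stand-in pre-trained backbone and train only a new task head. The layer sizes and data are illustrative, not a recipe for fine-tuning a real foundation model.

```python
# Freeze pre-trained layers, train a new task-specific head.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU())  # stand-in for pre-trained layers
head = nn.Linear(32, 2)                                 # new task-specific layer

for param in backbone.parameters():
    param.requires_grad = False  # keep general skills; adapt only the head

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(8, 16), torch.randint(0, 2, (8,))  # toy domain data
loss = loss_fn(head(backbone(x)), y)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.3f}")
```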
A foundation model is a pre-trained AI model that serves as a base for multiple downstream tasks. Trained on broad, unlabeled datasets at scale, it can be adapted via fine-tuning or prompting for use cases like chatbots, search, and document analysis.
Hallucination occurs when an AI system generates information that is syntactically plausible but factually incorrect or entirely made up. It is a major challenge in deploying LLMs for business-critical or regulated use cases.
Inference is the process of using a trained machine learning or AI model to make predictions or generate outputs based on new input data — such as prompting an LLM to summarize a document or answer a question.
A large language model (LLM) is a neural network trained on massive text corpora to understand and generate human-like language. LLMs like GPT, Claude, or PaLM can perform a range of tasks, from summarization to code generation, using natural language prompts.
Retrieval-augmented generation (RAG) is an AI architecture that enhances a language model’s answers by retrieving relevant documents from an external knowledge base and feeding them into the prompt — improving accuracy, grounding, and freshness of the generated response.
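A toy sketch of the retrieve-then-prompt flow; the embedding function and document vectors here are stubs, not a real library's API.

```python
# Retrieve the most relevant document by vector similarity, then ground
# the prompt in it before sending it to the LLM.
import numpy as np

documents = {
    "Our refund window is 30 days from purchase.": np.array([0.9, 0.1]),
    "Support hours are 9am-5pm on weekdays.": np.array([0.1, 0.9]),
}

def embed(text: str) -> np.ndarray:
    return np.array([0.8, 0.2])  # stub; a real system calls an embedding model

question = "How long do I have to request a refund?"
q_vec = embed(question)
best_doc = max(documents, key=lambda d: float(np.dot(documents[d], q_vec)))

prompt = f"Answer using only this context:\n{best_doc}\n\nQuestion: {question}"
print(prompt)  # the grounded prompt is what actually goes to the LLM
```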
Zero-shot learning allows models to generalize to tasks they haven’t been explicitly trained on, using only the prompt. Few-shot learning improves accuracy by including a handful of examples in the prompt, guiding the model’s response behavior.
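Side by side, the two prompting styles for the same sentiment task look like this (the reviews are made up):

```python
# Zero-shot: task description only. Few-shot: a handful of worked examples
# in the prompt guide the model's output format and behavior.
zero_shot = """Classify the sentiment of this review as positive or negative.
Review: "The battery died after two days."
Sentiment:"""

few_shot = """Classify the sentiment of each review as positive or negative.
Review: "Absolutely love it, works perfectly."
Sentiment: positive
Review: "Broke within a week, very disappointed."
Sentiment: negative
Review: "The battery died after two days."
Sentiment:"""

print(zero_shot)
print(few_shot)
```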