GPT-5.5 in the Enterprise: 10 Use Cases That Go Beyond Chatbots

21 May 2026

1. Why Is GPT-5.5 Becoming a Serious Enterprise AI Tool?

GPT-5.5 should be evaluated as workflow infrastructure for enterprise AI, not as a better chatbot. OpenAI positions it as a frontier model for complex professional work, with strengths in coding, online research, data analysis, spreadsheets, document creation, software operation, and tool use through the API. That matters because the highest-value enterprise pattern is no longer “ask a question, get an answer,” but “assign a bounded business task, retrieve context, call the right systems, check the output, and route decisions to the right human when risk is material.”

The timing is important. OpenAI says it now serves more than 7 million ChatGPT workplace seats; ChatGPT Enterprise seats have risen about ninefold year over year; weekly Enterprise messages have grown roughly eightfold; and the use of Custom GPTs and Projects has increased about nineteenfold year to date. In the same research, 75% of workers report that AI improves speed or quality, average reported time savings are 40–60 minutes per active day, and 75% say they can now complete tasks they previously could not do. In other words, the enterprise shift is already underway: from ad hoc prompting to repeatable workflows.

For CIOs, CTOs, Heads of Digital, and Heads of Operations, the strategic takeaway is straightforward. The strongest value pools remain customer operations, marketing and sales, software engineering, and R&D, while internal knowledge management can create cross-functional gains across the whole firm. OpenAI’s own enterprise guidance also points leaders toward repeatable “primitives” such as research, coding, data analysis, content creation, and automation, then encourages workflow mapping across whole departments rather than isolated prompts.

A rigor note is necessary. Because GPT-5.5 only became available in the API in late April 2026, longitudinal production data that is specific to GPT-5.5 is still limited. The most defensible evidence base therefore combines official GPT-5.5 documentation with adjacent enterprise case studies using OpenAI systems, academic productivity studies, and operational benchmarks from knowledge-heavy industries.

2. What Are the Best GPT-5.5 Use Cases for Enterprise Teams?

The KPI frames below are designed for business evaluation, not as guaranteed outcomes. The right way to read them is: these are the measures a serious enterprise pilot should baseline before rollout, then track weekly during pilot and monthly in production.

2.1 How Can GPT-5.5 Improve Customer Service Without Becoming Just Another Chatbot?

Typical scenarios: multilingual customer support, intent classification, agent assist, after-call summaries, returns and refund drafting, policy-grounded responses, and smart escalation. Business value and KPIs: containment rate, average handle time, first-contact resolution, repeat-contact rate, SLA attainment, CSAT, and NPS. Technical requirements: helpdesk, CRM, and order and payment systems, with RAG over policy content and approval gates before any refund or account-changing action. Main risks and mitigation: hallucinated policy answers, poor escalation logic, and unsafe automations; mitigate with retrieved citations, read-only defaults, and human approval for financially material actions. As directional evidence, an NBER working paper found that AI-guided support increased agent productivity by nearly 14%, while Klarna reported that its OpenAI-powered assistant handled two-thirds of service chats, cut resolution time from 11 minutes to under 2 minutes, reduced repeat inquiries by 25%, and held customer satisfaction at parity with human agents.

2.2 How Can GPT-5.5 Reduce Internal IT and HR Support Tickets?

Typical scenarios: service desk triage, access and entitlement guidance, onboarding question handling, policy Q&A, software request intake, and benefits or HR process support. Business value and KPIs: ticket deflection, MTTR, backlog, SLA adherence, onboarding cycle time, time-to-productivity, and employee satisfaction. Technical requirements: ITSM, identity provider, HRIS, internal knowledge base, and approval workflows for provisioning or permissions changes. Main risks and mitigation: unauthorized access changes and incorrect policy guidance; mitigate with SSO, RBAC, approval thresholds, and full audit logging. OpenAI’s enterprise report found that 87% of IT workers report faster IT issue resolution and 75% of HR professionals report improved employee engagement when using AI at work.

2.3 How Can GPT-5.5 Turn Enterprise Knowledge Bases into Actionable Answers?

Typical scenarios: policy retrieval, onboarding to a codebase or client account, cross-repository search, summarizing recent decisions, and answering internal process questions with source links. Business value and KPIs: search success rate, time-to-answer, onboarding time, duplicate-ticket reduction, and reuse of institutional knowledge. Technical requirements: Company Knowledge or File Search over permissioned repositories, with sources such as SharePoint, Google Drive, Slack, GitHub, HubSpot, Asana, and other connected apps; answers should always return citations to source material. Main risks and mitigation: stale documentation, source conflicts, and over-trust in low-quality files; mitigate with document ownership, freshness rules, and source-ranking policies. OpenAI says Company Knowledge returns answers with citations and respects existing permissions, while BBVA reports 20,000-plus Custom GPTs across the bank and a Peru assistant that cut some internal query handling from roughly 7.5 minutes to about 1 minute.

2.4 How Can Sales Teams Use GPT-5.5 for Account Research, RFPs and Proposals?

Typical scenarios: account research, meeting preparation, RFP parsing, proposal drafting, CRM summary generation, and personalized outreach preparation. Business value and KPIs: research time per account, proposal turnaround time, seller capacity, meeting prep time, pipeline coverage, and win rate. Technical requirements: CRM, email and calendar data, account notes, proposal templates, and external research sources; outbound content should remain human-reviewed before send. Main risks and mitigation: stale CRM data, fabricated personalization, and brand inconsistency; mitigate with source-grounded prompts, approval workflows, and template libraries. McKinsey identifies marketing and sales as one of the largest value pools for generative AI, and Clay’s OpenAI-powered sales research stack shows the pattern clearly: one system can centralize fragmented GTM data, automate prospect research, and materially expand outreach capacity.

2.5 How Can Finance Teams Use GPT-5.5 for Forecasting, Reporting and Close Processes?

Typical scenarios: monthly close support, variance explanation, spreadsheet modeling, procurement intake, treasury and tax research, board-pack drafting, and contract review support for finance. Business value and KPIs: days-to-close, forecast cycle time, forecast accuracy, variance analysis time, procurement turnaround, cost per transaction, and analyst hours saved. Technical requirements: ERP, procurement systems, spreadsheet tools, data warehouse access, and structured outputs for downstream workflows. Main risks and mitigation: bad accounting logic, control breaks, or unauthorized actions; mitigate with segregation of duties, read-only analysis first, approval routing, and audit logging. OpenAI and PwC are explicitly building finance agents for planning, forecasting, reporting, procurement, treasury, tax, and close workflows, and ChatGPT for Excel and Sheets, powered by GPT-5.5, is now generally available across plans.

2.6 How Can Legal and Compliance Teams Use GPT-5.5 Without Increasing Risk?

Typical scenarios: clause extraction, contract comparison, policy lookup, regulatory change triage, control narrative drafting, and first-pass risk summarization. Business value and KPIs: contract turnaround time, exception detection rate, outside counsel spend, compliance cycle time, false-positive and false-negative rates, and reviewer throughput. Technical requirements: authoritative legal and policy corpora, document management systems, strict citation discipline, and mandatory legal or compliance sign-off before final use. Main risks and mitigation: hallucinated citations, privilege leakage, and cross-border data issues; mitigate with restricted corpora, redaction, regional controls where needed, and human review. Thomson Reuters estimates that AI could free up around four hours per week in the near term, roughly 200 hours per year, and says that for U.S. lawyers this could translate into nearly $100,000 in extra billable time annually.

2.7 How Can Software Teams Use GPT-5.5 Beyond Code Autocomplete?

Typical scenarios: code generation, refactoring, debugging, test creation, legacy system discovery, architecture Q&A, and documentation generation. Business value and KPIs: lead time for change, deployment frequency, pull-request review time, defect escape rate, incident MTTR, and developer satisfaction. Technical requirements: repository and ticketing integration, access to internal documentation, CI or code-quality tooling, and secure handling of secrets. Main risks and mitigation: insecure code, leaking proprietary logic, and over-trust in generated changes; mitigate with human review, code scanning, sandboxing, and strong repo boundaries. GPT-5.5 is explicitly positioned for coding and professional work, OpenAI reports that 73% of engineers see faster code delivery, and GitHub’s controlled Copilot experiment found developers completed a coding task 55% faster on average.

2.8 How Can GPT-5.5 Help Business Leaders Analyze Data and Build Better Reports?

Typical scenarios: spreadsheet analysis, management-report drafting, dashboard explanation, anomaly triage, free-text commentary generation, and ad hoc data synthesis for leadership teams. Business value and KPIs: reporting cycle time, analyst hours saved, decision latency, insight adoption, and error rate in management commentary. Technical requirements: spreadsheets, governed metrics, warehouse or BI access, structured outputs, and validation rules for formula- or metric-sensitive work. Main risks and mitigation: spurious patterns, bad joins, and metric inconsistency; mitigate with semantic layers, approved queries, and human validation of high-impact reports. OpenAI’s own use-case guide treats data analysis as a core enterprise primitive, and its enterprise report says accounting and finance users report some of the largest time benefits.

2.9 How Can Procurement Teams Use GPT-5.5 for Vendor Research and Spend Control?

Typical scenarios: supplier discovery, spend intake, RFx summarization, procurement policy checks, vendor risk review, and purchase request routing. Business value and KPIs: procurement cycle time, PO turnaround, vendor onboarding time, savings captured, maverick-spend reduction, and approval SLAs. Technical requirements: ERP or procurement suite, contract repositories, inbox or form intake, policy knowledge base, and approval logic tied to spend thresholds. Main risks and mitigation: unauthorized purchases, recommendation bias, and supplier-data errors; mitigate with read-only research first, approval gates, and documented decision rules. OpenAI and PwC are already testing a procurement agent inside OpenAI’s own finance organization, while Ramp reported that Agent Builder cut iteration cycles by 70% and got a buyer agent live in two sprints rather than two quarters.

2.10 How Can Strategy Teams Use GPT-5.5 for Market Research and Due Diligence?

Typical scenarios: market scans, competitor analysis, sourcing memos, investment screening, due diligence support, and board-prep synthesis across internal and external evidence. Business value and KPIs: research cycle time, analyst capacity, coverage breadth, evidence quality, and decision latency. Technical requirements: web search, internal document retrieval, citations, traceability, and evaluation against known-good cases. Main risks and mitigation: low-quality external sources, shallow synthesis, and hidden falsehoods; mitigate with source-quality thresholds, analyst review, and evals based on real decision cases. OpenAI’s Deep Research is designed to search and analyze hundreds of sources for cited reports, Bain has described the tool as increasing individual research capacity, and Carlyle said OpenAI’s evaluation platform cut development time on a multi-agent due diligence framework by more than 50% while increasing agent accuracy by 30%.

3. Which GPT-5.5 Enterprise Use Cases Deliver the Fastest Business Value?

Use case	Main benefits	Key KPIs	Required integrations	Main risks
Customer service orchestration	Lower cost per case, faster resolution, higher service consistency	Containment, AHT, FCR, repeat contacts, CSAT/NPS	Helpdesk, CRM, OMS/payments, policy RAG	Hallucinated answers, unsafe actions
IT and employee support	Lower ticket volume, faster IT resolution, smoother onboarding	Deflection, MTTR, SLA, onboarding time	ITSM, IdP/SSO, HRIS, knowledge base	Unauthorized changes, policy errors
Enterprise knowledge search	Faster answers, shorter onboarding, better reuse of internal know-how	Time-to-answer, search success, duplicate-ticket rate	SharePoint, Drive, Slack, GitHub, DMS, File Search	Stale or conflicting sources
Sales intelligence and proposals	Higher seller capacity, faster RFP response, better personalization	Research time, proposal turnaround, win rate	CRM, email, calendar, proposal templates	Fabricated personalization, stale CRM
Finance operations	Faster close, better forecasting, lower analysis effort	Days-to-close, forecast cycle time, variance accuracy	ERP, procurement, spreadsheets, warehouse	Control breaks, wrong calculations
Legal and compliance review	Faster first pass, lower review effort, better issue coverage	Turnaround, exception rate, reviewer throughput	DMS, CLM, policy corpus, RAG	Hallucinated citations, privilege leakage
Software engineering	Faster delivery, lower toil, better documentation	Lead time, PR time, defect escape	Repo, tickets, docs, CI tools	Insecure code, IP leakage
Analytics and reporting	Faster reporting, broader self-service analysis	Reporting cycle time, analyst hours saved	BI, warehouse, spreadsheets, semantic layer	Metric drift, spurious insights
Procurement and vendor management	Faster intake and vendor review, better policy adherence	PO cycle time, onboarding time, savings captured	ERP/procurement, contracts, risk data	Unauthorized purchasing, recommendation bias
Research and due diligence	Faster research cycles, broader coverage, better evidence traceability	Research cycle time, evidence quality, analyst capacity	Web search, internal docs, citations, evals	Weak sources, shallow synthesis

The table above is a synthesis of the benchmark evidence and platform patterns discussed in the use cases section, especially around retrieval, approvals, connected data, and workflow evaluation.

4. What Architecture Does GPT-5.5 Need for Reliable Enterprise AI Workflows?

4.1 How Do GPT-5.5, RAG and Company Knowledge Work Together?

For read-heavy enterprise AI, the default pattern is GPT-5.5 plus RAG. In practice, that means File Search over vector stores for uploaded corpora, Company Knowledge for connected apps, and source citations in the answer. When workflows need to do something rather than only summarize, add function calls, prebuilt connectors, or custom MCP servers. OpenAI’s ecosystem now supports prebuilt connectors for tools such as Google Drive, SharePoint, Dropbox, Microsoft Teams, Outlook, and Gmail, while Company Knowledge across ChatGPT can pull from Slack, GitHub, HubSpot, Asana, and more; most ERP, bespoke CRM, BI, and line-of-business transactions will still need custom APIs or MCP apps. Structured Outputs should be used whenever the model feeds downstream systems, because schema-safe JSON reduces retry logic and downstream breakage.
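The downstream half of that pattern can be sketched in a few lines. The example below is a minimal illustration, assuming a hypothetical ticket-triage payload with `intent`, `needs_human`, and `citations` fields; none of these names come from OpenAI's documentation, and the sketch deliberately does not call any OpenAI API. It only shows how a consuming system might refuse output that does not match the agreed schema or that arrives without citations.

```python
import json
from dataclasses import dataclass

# Hypothetical intent labels for illustration; a real deployment defines its own.
ALLOWED_INTENTS = {"refund", "order_status", "complaint", "other"}


@dataclass(frozen=True)
class TriageResult:
    intent: str
    needs_human: bool
    citations: tuple


def parse_triage(raw: str) -> TriageResult:
    """Parse model output and reject anything that breaks the agreed schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    intent = data.get("intent")
    if intent not in ALLOWED_INTENTS:
        raise ValueError(f"unknown intent: {intent!r}")

    needs_human = data.get("needs_human")
    if not isinstance(needs_human, bool):
        raise ValueError("needs_human must be a boolean")

    citations = data.get("citations")
    if not isinstance(citations, list) or not citations:
        raise ValueError("at least one citation is required")
    if not all(isinstance(c, str) and c for c in citations):
        raise ValueError("citations must be non-empty strings")

    return TriageResult(intent, needs_human, tuple(citations))
```

Treating a missing citation as a hard failure is a design choice made for this sketch: it makes ungrounded answers visible at the integration boundary instead of in front of a customer.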

Reliability and scale should be engineered explicitly. Use traces to inspect every model call, tool call, and guardrail event; add task-specific evals to detect regressions; and keep human-annotated “gold” datasets for high-stakes workflows. For cost and latency, Batch API is a strong fit for offline workloads such as large-scale classification, embedding, and back-catalog document work, while Prompt Caching can materially reduce latency and input-token cost for long, repetitive enterprise prompts. Strong teams also model-mix: they reserve GPT-5.5 or stronger reasoning modes for ambiguous, long-context, or tool-heavy tasks, and use lighter models for simpler extraction or classification. Clay is a useful example of this operational pattern.
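The model-mix idea can be expressed as a small routing rule. This is a sketch under stated assumptions: the tier names `frontier` and `light`, the task kinds, and the 50,000-token threshold are all hypothetical placeholders, not OpenAI model identifiers or recommended values. A real router would be tuned against eval results and cost data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    kind: str            # e.g. "classification", "extraction", "research"
    context_tokens: int  # approximate size of the prompt plus retrieved context
    needs_tools: bool    # whether the task must call external tools


# Assumed values for illustration only.
SIMPLE_KINDS = {"classification", "extraction"}
LONG_CONTEXT_TOKENS = 50_000


def choose_tier(task: Task) -> str:
    """Return "frontier" for tool-heavy, long-context, or ambiguous work, else "light"."""
    if task.needs_tools or task.context_tokens > LONG_CONTEXT_TOKENS:
        return "frontier"
    if task.kind in SIMPLE_KINDS:
        return "light"
    # Anything not known to be simple is treated as ambiguous.
    return "frontier"
```

Defaulting unknown task kinds to the stronger tier trades cost for safety; teams that are more cost-sensitive might invert that default once their evals cover the long tail.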

4.2 When Should GPT-5.5 Use AI Agents, Tools and Business System Integrations?

The cleanest operating model mirrors process ownership. The business owner owns the KPI and the policy boundary. The AI product owner owns prompts, tool flow, fallback logic, and the acceptance criteria for output quality. Platform and data engineering own integrations, traceability, model routing, and cost controls. Security, privacy, and compliance own retention, DLP, SIEM or eDiscovery export, access policy, and regulatory guardrails. Human reviewers sit at the final mile for sensitive actions: payment movement, legal sign-off, regulatory filing language, customer credits, account access changes, or production code merges. OpenAI’s own workflow controls align with this structure, because the platform differentiates between automatic guardrails and explicit human review before sensitive side effects.

Risk management should be handled as a design problem, not a policy memo. Bias can enter through model behavior, retrieved content, or bad training examples; mitigate with representative eval sets and human review of sensitive decisions. Privacy risk is reduced through data minimization, redaction, permission-aware retrieval, and—where required—regional projects and data residency. Security risk rises sharply when systems gain write access, so default to read-only, review every app action, and red-team for prompt injection or jailbreaks. Compliance requires logs and exportability; OpenAI’s Compliance Platform is built to feed eDiscovery, DLP, and SIEM workflows. OpenAI also says business data is not used for training by default, Enterprise supports SSO and SCIM, Enterprise and API services have SOC 2 Type 2 and ISO-aligned certifications, and regional data residency is available for eligible customers and models.
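As a design-level illustration of "default to read-only", the sketch below gates every requested action through an explicit allowlist. The action names, the `gate` function, and the audit record shape are hypothetical and stand in for whatever the organisation's own tool layer exposes; this is not OpenAI's guardrail API.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


# Hypothetical action names for illustration.
READ_ONLY_ACTIONS = {"search_policy", "read_ticket", "summarize_case"}
APPROVAL_ACTIONS = {"issue_refund", "change_account_access", "merge_code"}


def gate(action: str,
         approved_by: Optional[str] = None,
         audit_log: Optional[list] = None) -> Decision:
    """Allow reads, require a named approver for writes, deny everything else."""
    if action in READ_ONLY_ACTIONS:
        decision = Decision.ALLOW
    elif action in APPROVAL_ACTIONS:
        decision = Decision.ALLOW if approved_by else Decision.NEEDS_APPROVAL
    else:
        decision = Decision.DENY  # unknown actions are never executed

    if audit_log is not None:
        # Record every decision so it can be exported for compliance review.
        audit_log.append({
            "action": action,
            "approved_by": approved_by,
            "decision": decision.value,
        })
    return decision
```

Denying unlisted actions by default means a prompt-injected request for a new capability fails closed rather than open.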

5. How Should Companies Govern GPT-5.5 in Enterprise Environments?

A strong pilot starts with one bounded workflow that is painful, frequent, and measurable, not with a vague “enterprise copilot.” OpenAI’s own guidance recommends prioritizing use cases by impact versus effort and then mapping multi-step workflows across departments. In practice, the best pilot candidates share five characteristics: clear process owner, visible baseline metrics, stable source-of-truth data, reversible outputs, and a meaningful economic unit such as cost per ticket, days-to-close, or seller hours per proposal.
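One way to make that screening repeatable is to encode the five characteristics as a checklist and rank the survivors by impact versus effort. The sketch below assumes a 1 to 5 scoring scale and a simple "impact minus effort" ordering; both are illustrative choices, not a method prescribed by OpenAI's guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    impact: int             # 1 (low) to 5 (high), scored by the business owner
    effort: int             # 1 (low) to 5 (high), scored by the delivery team
    has_owner: bool         # clear process owner
    has_baseline: bool      # visible baseline metrics
    stable_data: bool       # stable source-of-truth data
    reversible: bool        # outputs can be corrected or rolled back
    has_economic_unit: bool # e.g. cost per ticket, days-to-close


def is_pilot_ready(c: Candidate) -> bool:
    """A candidate qualifies only if all five characteristics hold."""
    return all([c.has_owner, c.has_baseline, c.stable_data,
                c.reversible, c.has_economic_unit])


def rank(candidates: list) -> list:
    """Drop unready candidates, then sort by highest impact relative to effort."""
    ready = [c for c in candidates if is_pilot_ready(c)]
    return sorted(ready, key=lambda c: (c.effort - c.impact, c.name))
```

A candidate with the best raw score but irreversible outputs is filtered out before ranking, which mirrors the point that readiness comes before attractiveness.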

Success metrics should mix business outcomes with AI quality controls. On the business side, track cycle time, backlog, SLA attainment, cost per transaction, CSAT or NPS, win rate, hours saved, and error-cost avoided. On the AI side, track grounded-answer accuracy, citation coverage, human acceptance rate, tool-selection accuracy, exception rate, policy-violation rate, and unit cost per completed workflow. A practical ROI formula is: ((hours saved × loaded labor rate) + cost avoided + revenue uplift) ÷ total program cost. That formula is simple, but the operating discipline matters more: OpenAI’s evaluation guidance explicitly argues against “vibe-based” deployment and recommends eval-driven iteration from the beginning.
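The formula can be worked through directly. The sketch below implements it as written; note that in this form the result is a benefit-to-cost ratio, so a value of 2.0 means two units of benefit per unit of cost, and a net return would subtract 1. The function name and all input figures are illustrative assumptions, not benchmarks.

```python
def benefit_cost_ratio(hours_saved: float,
                       loaded_labor_rate: float,
                       cost_avoided: float,
                       revenue_uplift: float,
                       total_program_cost: float) -> float:
    """((hours saved x loaded labor rate) + cost avoided + revenue uplift) / total program cost."""
    if total_program_cost <= 0:
        raise ValueError("total_program_cost must be positive")
    benefit = hours_saved * loaded_labor_rate + cost_avoided + revenue_uplift
    return benefit / total_program_cost


# Illustrative inputs only: 2,000 hours at a loaded rate of 60 per hour,
# 30,000 in avoided cost, 50,000 in revenue uplift, 100,000 program cost.
ratio = benefit_cost_ratio(2_000, 60.0, 30_000, 50_000, 100_000)
net_return = ratio - 1  # share of program cost returned on top of break-even
```

With these inputs the benefit is 200,000 against a cost of 100,000, so the ratio is 2.0 and the net return is 1.0.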

6. How Should an Enterprise GPT Pilot Move from Proof of Concept to Scale?

A successful enterprise GPT deployment should move in controlled stages: from a narrow pilot, through human-approved actions, to production hardening and cross-functional scale. The goal is not to automate everything immediately, but to build a repeatable operating pattern that can be safely expanded across the organization.

Discovery and scope: choose one workflow owner, baseline the key KPI and risk tier, and define the source systems that the GPT workflow will use.
Architecture and controls: connect retrieval layers and APIs, set role-based access control, define approval paths, and prepare the first evaluation set with guardrails.
Pilot in assist mode: keep outputs read-only or draft-only, measure quality, trace failures, and train frontline users on how to work with the system.
Approval-based rollout: enable narrow actions with human approval, add audit export, and introduce exception handling for edge cases.
Production hardening: optimize cost with model routing, caching, and batch processing, then tune prompts and evaluations weekly.
Scale across functions: replicate the operating pattern in adjacent teams and expand from one workflow to a managed portfolio of enterprise GPT use cases.

This staged approach helps companies avoid the common trap of treating GPT as a one-off productivity experiment. Instead, it turns enterprise AI deployment into a governed, measurable and scalable business capability.

The recommended motion is assist, then approve, then automate. Start with read-only or draft mode. Move next to narrow human-approved actions. Only after stable eval scores, strong auditability, and confirmed economic value should a workflow be allowed to automate more material decisions or actions. This is the difference between an AI demo and an enterprise operating capability.
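The assist, approve, automate progression can be written as an explicit promotion rule. This is a sketch, assuming a single eval pass rate and a hypothetical 0.95 threshold; the text above only specifies the conditions for the final step, so the condition used for moving from assist to approve is an assumption made for illustration.

```python
STAGES = ("assist", "approve", "automate")


def next_stage(current: str,
               eval_pass_rate: float,
               audit_export_enabled: bool,
               roi_confirmed: bool,
               min_pass_rate: float = 0.95) -> str:
    """Return the stage a workflow may operate in after a review cycle."""
    if current not in STAGES:
        raise ValueError(f"unknown stage: {current!r}")

    evals_stable = eval_pass_rate >= min_pass_rate

    if current == "assist":
        # Assumed rule: stable evals are enough to allow human-approved actions.
        return "approve" if evals_stable else "assist"

    if current == "approve":
        # Automation requires stable evals, auditability, and confirmed value.
        ready = evals_stable and audit_export_enabled and roi_confirmed
        return "automate" if ready else "approve"

    return "automate"
```

A workflow never skips a stage in this sketch: even with perfect scores, a workflow in assist mode moves to approve first.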

7. What Should Enterprise Leaders Do Next with GPT-5.5?

The best starting point is not “Where can we use GPT-5.5?” but “Which business workflows are expensive, repetitive, knowledge-heavy and measurable enough to improve?” This shift changes the conversation from experimentation to operating value. Instead of launching disconnected AI pilots, companies should identify workflows where GPT-5.5 can improve speed, quality, consistency or decision support without creating unacceptable operational risk.

For most organizations, the strongest first candidates are workflows that rely on large volumes of internal knowledge, repeated document analysis, customer or employee support, reporting, research, sales enablement or software delivery. These areas often have clear owners, visible bottlenecks and measurable KPIs. They also allow teams to start safely, because many outputs can remain in draft mode before the system is trusted with more advanced actions.

The companies that benefit most from enterprise GPT deployment will not be the ones that simply give every employee access to a powerful model. The real advantage will come from designing governed AI workflows, connecting GPT-5.5 to trusted data sources, measuring quality with evaluations, and scaling successful patterns across departments. In that sense, GPT-5.5 is not just a productivity tool. It is a foundation for a new layer of enterprise automation, decision support and knowledge work. For organizations ready to move from experimentation to scalable AI implementation, TTMS AI solutions for business can help identify high-value use cases, design secure workflows, and integrate AI with existing enterprise systems.

FAQ: GPT-5.5 use cases for enterprise

What are the best GPT-5.5 use cases for enterprise companies?

The best GPT-5.5 use cases for enterprise companies are usually knowledge-heavy, repeatable and measurable. Common examples include customer service support, internal knowledge search, software development, finance analysis, sales research, legal and compliance review, procurement support, reporting and market intelligence. These workflows are strong candidates because they often involve large volumes of text, documents, tickets, policies, data and decisions. GPT-5.5 can help teams work faster by summarizing information, drafting outputs, comparing documents, routing requests and supporting decisions with relevant context. However, the best use case is not necessarily the most impressive demo. It is the one with a clear business owner, a measurable KPI, reliable source data and a safe path from assist mode to controlled automation.

How is GPT-5.5 different from a traditional enterprise chatbot?

A traditional enterprise chatbot usually answers questions in a conversational interface. GPT-5.5 can go further because it can support multi-step workflows that include retrieval, reasoning, structured outputs, tool use and integration with business systems. This means it can help prepare reports, analyze documents, support agents, draft proposals, classify requests or guide users through complex processes. The difference is not only in the quality of the answer, but in the ability to operate inside a broader workflow. For enterprises, this matters because the real value of AI often comes from reducing process friction, not just from answering isolated questions.

Can GPT-5.5 automate enterprise workflows without human approval?

GPT-5.5 can support workflow automation, but enterprises should not move directly from experimentation to full automation. A safer approach is to start in read-only or draft mode, then introduce narrow human-approved actions, and only later automate more material decisions where the system has proven reliable. This is especially important in workflows involving payments, customer accounts, legal language, compliance obligations, access rights or production systems. Human approval is not a weakness in the early stages. It is a control mechanism that helps the organization test quality, understand edge cases and build trust before expanding automation.

What KPIs should companies track when implementing GPT-5.5?

Companies should track both business outcomes and AI quality metrics. Business KPIs may include cycle time, ticket resolution time, cost per case, proposal turnaround time, days-to-close, analyst hours saved, customer satisfaction, first-contact resolution or software delivery speed. AI-specific metrics should include answer accuracy, citation coverage, human acceptance rate, exception rate, tool-selection accuracy, policy violations and cost per completed workflow. The most mature organizations combine these measures into a regular evaluation process. This helps them move beyond subjective impressions and understand whether GPT-5.5 is actually improving performance at scale.

How should an enterprise start with GPT-5.5 implementation?

An enterprise should start with one bounded workflow rather than a broad, undefined AI initiative. The selected workflow should have a clear owner, a visible pain point, reliable source systems and measurable business value. The first phase should focus on discovery, scope, architecture, access controls and evaluation criteria. Then the company can run a pilot in assist mode, measure quality, collect feedback and gradually expand the level of automation. This staged approach reduces risk and makes it easier to replicate successful patterns across other teams. In practice, GPT-5.5 implementation is less about launching a model and more about building a controlled enterprise AI operating model.

TTMS blog – the world through the eyes of IT experts

GPT-Powered AI Agents: How to Match Autonomy to the Process?

Until recently, enterprise automation followed a simple division: systems performed tasks defined by rules, while cases requiring interpretation were passed to people. GPT-powered AI agents expand the range of processes that can be supported through automation. They can work with documents, incomplete data and the language used by customers or employees, making them suitable for processes that were previously difficult to automate.

For large organisations, this raises a practical question about AI agent autonomy: where does expert support end, and where does independent action within a process begin? In some situations, the agent’s role is to gather information and prepare a recommendation. In others, it prepares an action for approval. There are also areas where it can independently carry out repetitive steps when the organisation has defined the rules, permissions, limits and exception-handling paths.

GPT-powered AI agents can already support teams with ticket handling, document analysis, decision preparation, data updates and multi-step tasks. The key implementation question is: which decisions and actions should remain with people, and which can an agent perform within agreed rules?

An AI agent in the enterprise is a process participant, not just a chatbot

In practice, a GPT-powered agent needs five elements:
access to reliable sources of knowledge,
a clearly defined business objective,
tools and integrations with enterprise systems,
permissions aligned with its role,
rules that define the boundaries of its actions.

A language model can interpret the content of a document, a customer message or an incident description effectively. It does not, however, replace a business process. Workflows, permissions, validations and decision history are what make an agent operate predictably, even when it handles hundreds or thousands of cases each month.

Three levels of AI agent autonomy

In a large organisation, it is worth designing agents across three levels. This allows autonomy to grow alongside process maturity and trust in the solution.

Operating level	Agent’s role	Example tasks	Human role
Level 1: Advisory agent	Analyses information and prepares a recommendation.	Case summary, risk identification, proposed response, ticket prioritisation.	Makes the decision and carries out the action.
Level 2: Agent preparing an action for approval	Completes the next steps in a process, stopping before actions with significant consequences.	Creates an application, updates data, prepares a communication, submits an instruction for approval.	Reviews and approves specified steps.
Level 3: Agent performing tasks automatically	Independently carries out tasks in line with the process policy.	Case classification, status updates, sending standard information, creating a task in a system.	Handles exceptions, monitors quality and updates process rules.

The level of autonomy does not need to apply to the entire agent. The same agent may independently classify tickets, prepare a response that requires approval and transfer unusual cases to an expert. In practice, an organisation therefore designs autonomy for individual decisions and actions, rather than choosing a single operating model for the whole solution.

What determines whether an AI agent can complete a task independently?

A useful starting point is to assess two factors: the impact of the action on the organisation and whether it can be reversed. The greater the business, legal, financial or reputational consequences of a decision, the more important human approval becomes.

Nature of the action	Recommended model
Low impact, simple rules, easy to reverse	Automatic execution with a record in the process history.
Medium impact, data from several sources, possible exceptions	The agent prepares the action and an authorised person approves it.
High financial, legal or customer impact	The agent presents analysis, options and justification. The decision remains with a person.
Unclear rules, incomplete data or conflicting information	Automatic escalation to an expert, together with the context and collected data.

This principle is particularly useful in organisations operating across multiple countries, with complex permission structures and a large number of systems. Just as important as the list of tasks is knowing what the agent must not do and when it should hand a case over to a person.

7 questions to ask before giving an AI agent permission to act

What action should the agent perform? Describe it specifically, for example: “create a service ticket”, “update contact details” or “prepare a response to a complaint”.
What data will it work with? Identify the sources, data owners, update frequency and access rules.
What business rules must it follow? These may include financial limits, contractual terms, SLA levels, compliance requirements or communication policies.
What exceptions should stop the process? The agent needs a clear escalation path for unusual or incomplete cases, or those requiring specialist assessment.
Can the action be reversed? The ease of correction affects the appropriate level of autonomy, the scope of testing and the need for additional approval.
Who is accountable for the decision? The process owner, approver and technical team should all have clearly assigned roles.
How will the organisation establish why the agent took a particular action? The case history should show the input data, rules, sources used, recommendation and process outcome.

This is why AI agent projects often begin with bringing the process itself into order. The organisation gains more than a new AI capability: it also gains better visibility of responsibilities, exceptions and how work actually flows.

Where can GPT-powered AI agents add value in a large enterprise?
Customer service and back-office teams An agent can read a customer message, identify its subject, retrieve data from a CRM or case-management system, prepare a response in line with company policy and route it to the appropriate queue. For standard cases, it can also update a status, create a task for the team or send the customer a confirmation. Full autonomy works well for low-risk actions, such as providing information about the status of a ticket. Complaints, individual commercial terms or cases requiring interpretation of a contract should be passed to an employee together with the agent’s analysis. Finance, procurement and document workflows An AI agent can read a document, check whether the data is complete, compare it with a purchase order and flag discrepancies that require clarification. It can also prepare a case summary, collect missing information and initiate the appropriate approval workflow. Decision thresholds are particularly important in this area. The agent can process a document automatically when it meets all conditions, while cases that exceed a defined amount, contain discrepancies or concern a new supplier can be submitted for approval. IT, administration and ticket management In an IT environment, an agent can classify tickets, create an incident summary, search for similar cases in the knowledge base, propose actions in line with a runbook and update the user on progress. In administrative processes, it can prepare an application, complete data in a form and remind the requester about missing documents. For actions involving configuration changes, access permissions or production systems, an approval-based model is advisable. The agent reduces the time needed to prepare a decision, while the administrator retains control over the change. 
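The decision thresholds described for finance and procurement workflows can be illustrated with a minimal sketch. The `Invoice` type, the `route_invoice` function and the `APPROVAL_LIMIT` value are hypothetical names and figures chosen for illustration; real limits would come from the organisation's own approval policy.

```python
from dataclasses import dataclass

# Hypothetical limit for illustration only; real values come from approval policy.
APPROVAL_LIMIT = 10_000.00


@dataclass
class Invoice:
    amount: float
    matches_purchase_order: bool  # False when the comparison found discrepancies
    supplier_is_new: bool


def route_invoice(invoice: Invoice, limit: float = APPROVAL_LIMIT) -> str:
    """Return 'auto_process' only when every condition is met."""
    if invoice.amount > limit:
        return "needs_approval"  # exceeds the defined amount
    if not invoice.matches_purchase_order:
        return "needs_approval"  # discrepancy requires clarification
    if invoice.supplier_is_new:
        return "needs_approval"  # new suppliers are reviewed by a person
    return "auto_process"


print(route_invoice(Invoice(amount=480.0, matches_purchase_order=True, supplier_is_new=False)))
print(route_invoice(Invoice(amount=480.0, matches_purchase_order=True, supplier_is_new=True)))
```

Keeping the conditions as separate checks makes it easy to record which rule sent a case for approval, which supports the process history the article calls for.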
Sales and commercial information management An agent can prepare a briefing before a meeting by bringing together information from the CRM, proposals, correspondence and notes, then highlighting open points and suggested next steps. After the meeting, it can create a summary, propose data updates and prepare tasks for the team. These are extensions of scenarios already familiar from everyday work with generative AI. Read more about what the current generation of models helps teams achieve in our article: GPT-5.6 from OpenAI: capabilities and business applications. Why does an AI agent need a workflow? An AI agent can interpret information and suggest next steps, but the process should define the sequence of actions, required validations and the people responsible for approval. In a large organisation, this is what determines the repeatability and scalability of the solution. A process automation platform can act as a control layer: it triggers a task, provides the agent with the necessary context, receives the result, records the history and routes the case to the next stage. The agent then becomes part of a controlled workflow rather than operating as a separate tool outside the core process. This approach is relevant to document workflows, request handling, HR processes, procurement and administration. See how WEBCON BPS can support the digitalisation and control of business processes, and how TTMS delivers process automation. Four forms of human oversight of an AI agent Human-in-the-loop is a model of control embedded in the process—from reviewing recommendations to handling exceptions and making decisions with greater impact. In a mature solution, people can play several different roles. Approving an action when the agent has prepared a specific instruction, communication or system change. Selecting an option when the agent has presented several possible solutions and their consequences. 
Handling an exception when a case falls outside the agent’s rules, available data or permissions. Overseeing process quality by analysing errors, rejected recommendations, completion times and changing business needs. The most effective implementations use all four forms. The team does not manually review every standard operation, yet retains full control over actions with greater significance and over the direction in which the process evolves. It is also worth observing whether human approval genuinely improves process safety or simply moves a bottleneck elsewhere. If an approver nearly always accepts the agent’s proposals without changes and the cases are easy to reverse, the organisation can consider automating the selected step. If recommendations often require correction or the approver needs to return to source data, this indicates that the process rules, quality of knowledge or scope of the agent’s permissions need attention. When can an AI agent act automatically? Automation delivers the most value when a task is frequent, has a repeatable structure, relies on available data and leads to a clearly defined outcome. It is also important to ensure that execution can be verified and corrected when data or rules change. Good candidates include ticket classification, routing requests to the appropriate queue, completing data from approved sources, creating standard tasks, updating statuses and sending communications based on approved templates. Combining GPT models with an enterprise knowledge layer, integrations and security rules provides a significant advantage. This allows the solution to work with information available to a specific role, rather than with an unstructured collection of documents and conversations. When should an AI agent primarily provide advice? 
An advisory role is especially valuable in cases that require contextual assessment, interpretation of company policy, negotiation, an individual approach to a customer or decisions with significant financial and legal consequences. In these situations, the agent can gather facts, summarise documents, identify missing information, compare options and prepare the rationale for a recommendation. The person gains time for business judgement, while the decision remains grounded in the knowledge, experience and accountability appropriate to the role. This model is particularly useful for managers, compliance specialists, legal teams, strategic procurement, finance teams and teams responsible for key accounts. FAQ What is the difference between an AI agent and a chatbot? A chatbot primarily responds to questions in a conversation. An AI agent can also use approved tools, retrieve information from enterprise systems, follow workflow rules and complete defined process steps. Its value comes from combining language understanding with access to business context, permissions and a controlled process. Should every AI agent have human approval before taking action? No. The appropriate level of oversight depends on the impact and reversibility of the action. Low-risk, repeatable activities such as categorising tickets or sending a standard confirmation can be automated under defined rules. Actions affecting customers, contracts, finances, compliance or production systems should usually include approval or escalation to an authorised person. Can one AI agent operate at different levels of autonomy? Yes. Autonomy should be designed for individual actions rather than assigned to an entire solution. The same agent may classify a request automatically, prepare a response for approval and escalate an unusual case to an expert. This makes it possible to automate safely without treating every task in the same way. What information does an AI agent need to work reliably in an enterprise? 
An agent needs access to reliable and current knowledge sources, a clearly defined objective, appropriate permissions and rules for handling exceptions. It should also receive only the context relevant to the task and role. Workflows, validations and an auditable history of actions help ensure that its output can be reviewed and used consistently. How can a company start implementing GPT-powered AI agents? Start with one clearly defined process step that has measurable volume, repeatable inputs and a known outcome. Set the boundaries of the agent’s permissions, test it with standard and exceptional cases, and measure the effect on process time, quality and escalations. Once the team has evidence that the solution works reliably, its scope and autonomy can be expanded gradually.
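As a closing illustration, the impact-and-reversibility rule discussed in this article could be expressed as a small routing function. This is a sketch under stated assumptions: the `Impact` levels, the `oversight_model` function and the returned labels are hypothetical, and the handling of low-impact actions that cannot be reversed (sent for approval here) is an assumption the article does not spell out.

```python
from enum import Enum


class Impact(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def oversight_model(impact: Impact, reversible: bool, inputs_clear: bool) -> str:
    """Map one action to an oversight model based on impact and reversibility."""
    if not inputs_clear:
        # Unclear rules, incomplete data or conflicting information.
        return "escalate_to_expert"
    if impact is Impact.HIGH:
        # The agent presents analysis; the decision remains with a person.
        return "advise_only"
    if impact is Impact.LOW and reversible:
        return "automatic_with_audit_record"
    # Medium impact, or low impact that is hard to reverse (assumption).
    return "prepare_for_approval"


print(oversight_model(Impact.LOW, reversible=True, inputs_clear=True))
print(oversight_model(Impact.HIGH, reversible=True, inputs_clear=True))
```

The function is applied per action rather than per agent, which mirrors the point that one agent may operate at several levels of autonomy.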

NIS2 Compliance Documentation: What Evidence Should Businesses Prepare?

NIS2 compliance cannot be demonstrated by a policy library alone. A regulator, auditor or management body may need to understand not only what an organisation intended to do, but also whether its cybersecurity measures were approved, implemented, tested and improved over time. That distinction makes evidence management a central part of NIS2 readiness. Policies describe the expected approach. Evidence shows that people followed it, controls operated, exceptions were governed and material weaknesses reached the right decision-makers. Directive (EU) 2022/2555, known as NIS2, does not prescribe one universal folder of documents for every regulated entity. It establishes outcomes and minimum areas that essential and important entities must address through appropriate and proportionate technical, operational and organisational measures. The exact records expected from an entity depend on its risk profile, services, sector, size, national implementing law and, in some cases, sector-specific EU rules. This guide explains how businesses can build practical NIS2 compliance documentation, what evidence may support Articles 20, 21 and 23 of the Directive, and how to organise a defensible evidence pack without creating unnecessary bureaucracy. It is designed as a documentation and assurance guide—not as another general implementation roadmap or audit checklist. 1. Does NIS2 require specific compliance documentation? NIS2 does not contain a single exhaustive schedule titled “documents every entity must maintain”. Instead, it creates duties that are difficult to perform or demonstrate without reliable records. Article 20 requires management bodies of essential and important entities to approve cybersecurity risk-management measures, oversee their implementation and follow relevant training. Article 21 requires appropriate and proportionate risk-management measures covering at least ten specified areas. Article 23 establishes staged reporting for significant incidents. 
Supervision provisions allow competent authorities to request information and access data, documents or other evidence needed for their tasks. As a result, documentation should support three questions: What decision, process or control was required? Who approved, owned or performed it, and when? What evidence shows that it operated and produced the intended result? National legislation may define additional documents, registration information, audit requirements, reporting forms or retention periods. Organisations operating in several Member States should therefore maintain a jurisdiction register instead of assuming that one evidence pack satisfies every national procedure. Certain DNS, cloud, data-centre, managed service, managed security, online marketplace, online search, social networking and trust service providers are also subject to Commission Implementing Regulation (EU) 2024/2690. For those entities, the Regulation and ENISA’s supporting technical guidance provide more detailed requirements and examples. Other entities may use that material as a reference, but should not present it as automatically binding outside its legal scope. 2. Documentation, records and evidence: what is the difference? These terms are often used interchangeably, but separating them improves assurance. 
| Category | Purpose | Examples |
| --- | --- | --- |
| Governing documents | Define what the organisation expects and who is responsible | Policies, standards, procedures, governance charters and control descriptions |
| Operational records | Show that a process or control was performed | Access reviews, vulnerability tickets, backup logs, supplier assessments and training records |
| Decision evidence | Shows how risks, exceptions and priorities were considered | Management minutes, risk acceptance, investment approvals and escalation records |
| Effectiveness evidence | Shows whether measures work as intended | Test results, restoration exercises, metrics, audits and verified remediation |
| Regulatory records | Support registration, notification and supervisory engagement | Scope analysis, authority correspondence, incident reports and information requests |

A policy is not proof that the process operates. A screenshot is not necessarily reliable evidence if its source, date, scope and owner are unclear. A test report has limited value if no one owns the findings or verifies their closure.

Strong evidence connects design with operation. It should allow a reviewer to trace a requirement to a control, the control to its owner, the owner to operational records and any failure to a documented decision or corrective action.

3. Build a NIS2 evidence map before collecting files

Collecting everything creates cost, confusion and additional security risk. A better approach begins with an evidence map.

An evidence map links each applicable obligation to the entity’s controls and records. It can be maintained in a governance, risk and compliance platform or a controlled spreadsheet, provided ownership, versioning and access are appropriate.
| Evidence-map field | What to record |
| --- | --- |
| Legal or control reference | Applicable NIS2 article, national provision, implementing rule or internal control requirement |
| Expected outcome | The risk or service outcome the measure should achieve |
| Control description | How the organisation addresses the outcome |
| Owner and operator | Who is accountable and who performs the activity |
| Evidence source | System, repository or process producing the record |
| Frequency or trigger | Monthly, quarterly, annually, after change or after an incident |
| Reviewer | Who assesses completeness and effectiveness |
| Retention and protection | How long evidence is kept and how access and integrity are protected |
| Status and exceptions | Current result, open gaps, accepted risk and remediation |

The map should reflect the services and systems in scope. A generic template can accelerate the work, but it should not become the basis for unsupported declarations. Where a control does not apply, the organisation should record the reason rather than leaving an unexplained blank.

4. Scope and applicability records

An organisation cannot build credible NIS2 compliance documentation without first establishing which entities and services are covered. Scope evidence is especially important for corporate groups, cross-border operations and businesses whose activities cross several sectors.
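For teams that keep the evidence map from section 3 in code or a controlled spreadsheet export, one row could be modelled roughly as follows. This is a minimal sketch: the `EvidenceMapEntry` type, its field names and the `missing_fields` helper are hypothetical, and the rule that a non-applicable control needs only a reference plus a recorded reason is an assumption based on the guidance about avoiding unexplained blanks.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class EvidenceMapEntry:
    reference: str = ""              # legal or control reference
    expected_outcome: str = ""
    control_description: str = ""
    owner_and_operator: str = ""
    evidence_source: str = ""
    frequency_or_trigger: str = ""
    reviewer: str = ""
    retention_and_protection: str = ""
    status_and_exceptions: str = ""
    not_applicable_reason: str = ""  # filled in only when the control does not apply


def missing_fields(entry: EvidenceMapEntry) -> List[str]:
    """List the fields left blank without a recorded explanation."""
    if entry.not_applicable_reason.strip():
        required = ["reference"]  # assumption: the reason replaces the other fields
    else:
        required = [f.name for f in fields(entry) if f.name != "not_applicable_reason"]
    return [name for name in required if not getattr(entry, name).strip()]


entry = EvidenceMapEntry(
    reference="Article 21 (backup and recovery)",
    expected_outcome="Critical data can be restored within agreed objectives",
    control_description="Quarterly restoration test",
    owner_and_operator="Head of Infrastructure / backup team",
    evidence_source="Backup platform restore reports",
    frequency_or_trigger="Quarterly",
    status_and_exceptions="Last test passed",
)
print(missing_fields(entry))
```

A check like this can run whenever the map is updated, so that blanks are surfaced as open gaps instead of being discovered during an assessment.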
A documented applicability file may include: a list of relevant legal entities and establishments; services and activities mapped to Annex I or Annex II of NIS2 and corresponding national provisions; employee and financial data used for size classification; analysis of partner and linked enterprises where relevant to SME calculations; size-independent rules and any specific designation decisions; jurisdiction and competent-authority mapping; interaction with sector-specific EU legislation, such as DORA; legal advice or internal interpretation supporting uncertain classifications; review triggers for acquisitions, new services, restructuring and legislative change. The purpose is not to produce a long legal memorandum for every entity. It is to make the conclusion reproducible. A reviewer should be able to see what facts were considered, which version of the law was used and who approved the result. Scope documentation should also identify the network and information systems supporting covered services. Legal entity boundaries do not always match technical boundaries. Shared identity platforms, cloud tenants, data centres or managed providers may support several companies and services, making dependency evidence important. 5. Governance and management-body evidence Article 20 makes management involvement a substantive requirement. Evidence should show more than the presence of cybersecurity on an annual agenda. 5.1 Approval of cybersecurity risk-management measures Approval evidence can include board or management-body minutes, resolutions, decision papers and approved policy sets. The record should identify what was approved, the scope of the decision, material risks, known limitations, required resources and the reporting mechanism used to oversee implementation. Where a large package is approved, a controlled index can identify all included documents and their versions. This avoids uncertainty about whether a policy was actually part of the decision. 
5.2 Oversight of implementation Oversight records may include periodic dashboards, risk committee minutes, programme status reports, overdue-action escalations and decisions concerning residual risk. Reporting should enable informed challenge. Useful indicators connect controls with service outcomes. Examples include the proportion of critical services covered by tested recovery plans, overdue remediation for critical vulnerabilities, privileged access awaiting review, critical suppliers without current assurance and high-risk audit findings past their agreed date. Raw activity counts are weaker. The number of alerts processed or employees trained may be relevant, but it does not by itself show whether the organisation can protect and recover its services. 5.3 Management training Training evidence should record the audience, date, subject matter, facilitator and completion. The content should help management understand its responsibilities, the entity’s threat and risk profile, significant-incident escalation, risk acceptance and oversight expectations. An attendance list alone may not demonstrate that training was suitable. Agenda materials, learning objectives, exercises or confirmation of understanding provide stronger context. 5.4 Accountability and delegated responsibility An organisation should maintain a current responsibility model. This can include governance terms of reference, role descriptions, RACI matrices, escalation paths and authority for risk acceptance. Operational tasks may be delegated to security teams, technology owners or providers. Documentation should still show how the management body receives assurance and how material matters are escalated. Outsourcing a control does not outsource the regulated entity’s responsibility for managing its risk. 6. Risk-analysis and treatment evidence Article 21 begins with policies on risk analysis and information-system security. 
A defensible evidence trail demonstrates that risk management affects decisions and investment. Core documentation may include: the approved cybersecurity risk methodology; risk criteria, impact scales and likelihood definitions; a service, process, information and technology inventory; risk assessments and a current risk register; treatment plans with owners, resources and deadlines; risk acceptance and exception records; reassessment after material change or an incident; links between risks, controls, suppliers and continuity priorities. The risk register should not be an isolated spreadsheet owned only by the security team. Material risks need accountable business owners and a route to management. Treatment records should make clear whether the organisation is reducing, avoiding, transferring or accepting the risk. Exceptions require particular care. A patching exception, unsupported system or delayed access review should identify the affected service, reason, compensating measures, approver, expiry date and review. Open-ended exceptions weaken both security and evidence quality. 7. Evidence for the Article 21 risk-management areas The following examples illustrate records that may support the ten minimum areas in Article 21. They are not a universal statutory checklist. 
| Article 21 area | Examples of useful evidence |
| --- | --- |
| Risk analysis and system-security policies | Methodology, risk register, policy approvals, review history and exception records |
| Incident handling | Response plan, severity criteria, incident tickets, communication logs, exercise reports and lessons learned |
| Business continuity, backup and crisis management | Business impact analysis, recovery objectives, continuity plans, backup monitoring, restoration results and crisis exercises |
| Supply-chain security | Supplier inventory, risk tiering, due diligence, security clauses, assurance reports, monitoring and exit plans |
| Secure acquisition, development and maintenance | Security requirements, architecture reviews, secure-development records, change approvals, vulnerability tickets and patch evidence |
| Assessment of effectiveness | Control testing, penetration tests, audits, metrics, findings and verified remediation |
| Cyber hygiene and training | Baseline standards, update and configuration records, role-based training, simulations and follow-up actions |
| Cryptography and encryption | Cryptographic policy, approved standards, key and certificate inventories, rotation logs and exception decisions |
| Human resources security, access and assets | Screening where lawful, joiner-mover-leaver records, access reviews, privileged-account evidence and asset inventories |
| MFA and secure communications | Coverage reports, enrolment and recovery controls, exception records, authentication tests and emergency communication exercises |

Evidence must remain proportionate. A small important entity and a multinational essential entity may address the same legal area with different operating models and documentation depth. The key question is whether the record is sufficient to demonstrate the control in the context of the entity’s risk.

8. Incident-reporting documentation

Article 23 requires essential and important entities to notify significant incidents through a staged process.
The Directive provides for an early warning without undue delay and within 24 hours of awareness, an incident notification within 72 hours, and generally a final report within one month of the incident notification. Intermediate or progress reports may also be required. Incident evidence should support both response and the reporting decision. Useful records include: the time and source of initial detection; the point at which the organisation became aware of the incident; technical and business severity assessments; the significant-incident assessment and its approver; affected services, systems, users and other persons; suspected malicious or unlawful activity; indicators of compromise and cross-border implications where available; containment, mitigation and recovery actions; copies of regulatory submissions and acknowledgements; customer, contractual, data-protection and law-enforcement communications; decision logs showing what was known and unknown at each stage; root-cause findings, lessons learned and corrective actions. 8.1 Preserve a reporting timeline The reporting clock can begin before a complete forensic conclusion is available. A reliable timeline is therefore essential. Systems should use synchronised time sources, and the incident lead should record material decisions as they occur. The evidence should distinguish facts, assumptions and pending investigation. Early notifications can be qualified. A clear record of uncertainty is more credible than retrospective notes that imply the organisation knew everything at the start. 8.2 Document non-reporting decisions Not every security event meets the threshold of a significant incident. When an event is assessed as non-reportable, the organisation should retain a proportionate record of the facts, criteria and decision. This helps demonstrate consistency and enables later reassessment if the impact changes. 
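Because the reporting timeline matters so much, some teams calculate the latest submission times as soon as awareness is recorded. The sketch below assumes the staged deadlines described in this section; the function names are hypothetical, and both the 72-hour window being counted from awareness and the way "one month" is computed (same day of the following month, clamped to the month's last day) are simplifying assumptions that should be checked against national law and authority guidance.

```python
import calendar
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def add_one_month(moment: datetime) -> datetime:
    """Same day next month, clamped to that month's last day (assumption)."""
    year = moment.year + (1 if moment.month == 12 else 0)
    month = 1 if moment.month == 12 else moment.month + 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)


def reporting_deadlines(
    aware_at: datetime, notification_sent_at: Optional[datetime] = None
) -> Dict[str, datetime]:
    """Latest times for each stage, based on the recorded awareness time."""
    deadlines = {
        "early_warning": aware_at + timedelta(hours=24),
        "incident_notification": aware_at + timedelta(hours=72),
    }
    if notification_sent_at is not None:
        # The final report is counted from the incident notification.
        deadlines["final_report"] = add_one_month(notification_sent_at)
    return deadlines


aware = datetime(2026, 1, 30, 9, 0, tzinfo=timezone.utc)
sent = datetime(2026, 1, 31, 12, 0, tzinfo=timezone.utc)
for stage, due in reporting_deadlines(aware, sent).items():
    print(stage, due.isoformat())
```

Using timezone-aware timestamps keeps the calculation consistent with the synchronised time sources recommended for the incident timeline.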
National law, authority guidance and the Commission Implementing Regulation for specified digital and ICT entities may provide additional thresholds and procedural detail. Reporting templates and contact information should be maintained for every relevant jurisdiction. 9. Supply-chain and supplier evidence Supplier documentation should show that the organisation understands which relationships could affect its covered services and applies scrutiny proportionate to the risk. An evidence set may contain: a supplier inventory linked to services and information assets; inherent-risk and criticality classifications; due-diligence questionnaires and supporting documents; independent assurance reports and certifications, with scope and exceptions reviewed; security requirements in contracts and statements of work; incident-notification and cooperation provisions; subcontracting, location and concentration-risk information; access granted to supplier personnel and periodic access reviews; performance, vulnerability and incident monitoring; reassessment records following change or an incident; continuity, substitution and secure-exit plans. A certificate should not be stored without analysis. Its scope, period, exclusions and relationship to the delivered service matter. Similarly, a completed questionnaire is a supplier statement, not independent proof. Higher-risk suppliers may require interviews, technical evidence, independent reports or contractual verification rights. Documentation should also record the organisation’s response to deficiencies. Accepting a supplier risk without an owner, expiry date or compensating measure creates an unmanaged exception. 10. Business continuity, backup and recovery evidence Continuity documentation should connect business priorities with technical recovery capability. The evidence chain may begin with business impact analysis and service dependency maps. 
These should support recovery time and recovery point objectives, response priorities, backup architecture, alternative procedures and supplier arrangements. Operational evidence can include: current continuity, disaster-recovery and crisis-management plans; protected backup configuration and monitoring; restoration tests showing which data and systems were recovered; actual recovery duration compared with approved objectives; test limitations, failures and remediation; exercise attendance, decisions and lessons learned; emergency contacts and out-of-hours escalation checks; evidence that critical providers participated where relevant. A successful backup job is not the same as a successful recovery. Evidence should demonstrate that required data can be restored into an operable service under plausible conditions. Testing should vary scenarios. Tabletop exercises are useful for decisions and communication, while technical restoration tests provide evidence of recovery capability. More complex entities may use integrated exercises involving suppliers, facilities and business teams. 11. Security-control and technical evidence Technical evidence is often abundant but difficult to interpret. The objective is not to export every log. It is to retain records that demonstrate scope, operation, review and response. Examples include: approved secure-configuration baselines and compliance reports; vulnerability scan coverage and remediation tickets; patch status linked to criticality and exceptions; endpoint, network and cloud monitoring coverage; identity and privileged-access reviews; MFA coverage and bypass exceptions; encryption, key and certificate management records; change approvals and security testing; secure-development and dependency-scanning results; asset inventory completeness checks; alert investigations and response outcomes. Tool screenshots should be used carefully. 
Prefer repeatable reports or system exports with the source, timestamp, query scope and responsible reviewer recorded. Evidence should be protected from unauthorised modification, particularly when it may support an investigation.

12. Evidence that measures are effective

Article 21 includes policies and procedures to assess the effectiveness of cybersecurity risk-management measures. This means documentation should go beyond implementation status.

An effectiveness file can combine: defined control objectives and success criteria; control self-assessments; technical testing and independent review; security and resilience metrics; internal and external audit reports; incidents and near misses indicating control performance; trends and recurring weaknesses; corrective actions with owners and deadlines; proof that high-risk remediation was independently verified.

Metrics should be interpreted. For example, “98% of critical systems patched on time” requires a defined population, a reliable inventory, treatment of exceptions and information about the remaining 2%. A positive average can conceal exposure in a critical service.

Management reporting should distinguish control design, implementation and effectiveness. A control can be well designed but inconsistently operated, or widely deployed but ineffective against a realistic threat.

13. How to assemble a NIS2 evidence pack

An evidence pack is a controlled view of relevant records, not a permanent duplicate of every operational file.

1. Start with an index

The index should identify the requirement, document or evidence item, owner, version or period, source location, access classification and review status. It should also identify unavailable evidence and open remediation.

2. Use service-based navigation

Regulatory obligations apply to entities, but operational impact occurs through services. Organising evidence around covered services helps reviewers understand dependencies, risks, controls and recovery priorities.

3. Select representative periods and samples

Evidence should show operation over time. One access review performed immediately before an assessment does not demonstrate a mature quarterly process. Samples should cover the relevant period, locations and technologies.

4. Preserve source and context

Each item should make clear where it came from, who produced or approved it, the date, scope and meaning. Remove unexplained screenshots, unlabeled exports and drafts that could be mistaken for approved records.

5. Record gaps honestly

Do not create evidence retrospectively to imply that an activity occurred. Where evidence is missing, document the gap, immediate risk response, owner and remediation date. Transparent remediation is more defensible than an unreliable record.

6. Perform quality review

Legal, security, risk and service owners should check consistency. The asset inventory should agree with vulnerability coverage. Supplier classification should drive assurance. Recovery objectives should match test reports. Management minutes should reflect the material risks shown in dashboards.

14. Evidence quality principles

A practical evidence standard can be expressed through seven characteristics: relevant: it supports a defined requirement or control; authentic: its origin and ownership can be established; complete: it includes the scope and context needed for interpretation; accurate: it reflects what actually occurred; timely: it covers the required period and was produced at the appropriate time; protected: access, integrity and confidentiality are controlled; retrievable: authorised teams can find it when required.

These principles help teams decide whether a proposed record adds assurance or merely volume.

15. Retention, confidentiality and evidence security

NIS2 does not establish one universal retention period for every type of compliance record.
Retention should be determined using national requirements, limitation periods, sector rules, contractual duties, audit cycles, incident-investigation needs and the organisation’s risk. Evidence may contain sensitive architectural details, vulnerabilities, personal data, credentials, supplier information or legal advice. It should be classified and protected accordingly. Collecting material for an assessment does not justify placing unrestricted copies in a shared folder. The organisation should define: approved repositories and access roles; version control and approval status; retention and defensible disposal; legal hold and investigation procedures; integrity protection and backup; secure transfer to auditors or authorities; handling of personal and privileged information; return or deletion of assessment copies. Data minimisation matters. Evidence should be sufficient for its purpose without exposing unnecessary personal data, secrets or complete security configurations. 16. Common NIS2 documentation mistakes 16.1 Treating policies as proof of operation Policies establish intent. They need corresponding reviews, logs, tests, decisions and corrective actions. 16.2 Collecting screenshots without context A screenshot may not show the source, date, population, filters or reviewer. Use controlled exports and explanatory notes where possible. 16.3 Building the evidence pack only before an audit Last-minute collection produces gaps and inconsistent records. Evidence generation should be embedded into normal control operation. 16.4 Keeping expired exceptions open Exceptions should have owners, compensating measures and expiry dates. Repeated extensions require appropriate challenge and escalation. 16.5 Storing sensitive evidence too broadly Centralisation improves retrieval but can create a valuable target. Use classification, least privilege, logging and secure transfer. 
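The point in 16.4 about exceptions lends itself to a simple automated check. The following is a sketch only: the `SecurityException` record, the `exception_issues` function and the `max_extensions` threshold are hypothetical and would need to match the organisation's own exception register and escalation rules.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class SecurityException:
    identifier: str
    owner: str
    compensating_measure: str
    expires_on: Optional[date]
    extensions: int = 0  # how many times the expiry date has been extended


def exception_issues(
    exc: SecurityException, today: date, max_extensions: int = 2
) -> List[str]:
    """Return the reasons an exception record needs attention."""
    issues = []
    if not exc.owner.strip():
        issues.append("no owner")
    if not exc.compensating_measure.strip():
        issues.append("no compensating measure")
    if exc.expires_on is None:
        issues.append("no expiry date")
    elif exc.expires_on < today:
        issues.append("expired")
    if exc.extensions > max_extensions:
        issues.append("repeated extensions: escalate")
    return issues


record = SecurityException("EXC-1", "", "Network isolation", date(2026, 3, 31), extensions=3)
print(exception_issues(record, today=date(2026, 5, 21)))
```

Running a check like this on a schedule turns exception review into part of normal control operation, so expired items surface before an audit.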
16.7 Ignoring contradictory records

An approved policy may claim quarterly reviews while operational records show annual activity. Resolve discrepancies instead of presenting them as separate truths.

16.8 Equating certification with complete NIS2 evidence

ISO/IEC 27001 certification can provide useful governance and control records. It does not automatically demonstrate legal scope, national notification procedures or every NIS2 outcome. The certification scope and statement of applicability must be understood.

17. NIS2 compliance documentation checklist

Use this checklist as a planning aid. Adapt it to the entity, national law and risk profile.

[ ] Applicability and jurisdiction analysis is documented and approved.
[ ] Covered services, systems, data, people, facilities and suppliers are mapped.
[ ] Management approval of risk-management measures is traceable.
[ ] Management oversight and cybersecurity training records are current.
[ ] Roles, escalation paths and risk-acceptance authority are defined.
[ ] Risk methodology, assessments, register and treatment plans are maintained.
[ ] Policies are version-controlled, approved and linked to operating procedures.
[ ] Incident records preserve awareness, decisions, actions and reporting timelines.
[ ] Non-reporting decisions for material events use documented criteria.
[ ] Continuity and recovery documentation is linked to critical services.
[ ] Backup and restoration evidence demonstrates recoverability.
[ ] Supplier inventory, classification, due diligence and monitoring are current.
[ ] Security clauses and supplier-exit arrangements reflect criticality.
[ ] Vulnerability, patch, configuration and change records show control operation.
[ ] Access, privileged accounts and MFA exceptions are reviewed.
[ ] Cryptographic keys and certificates are governed and monitored.
[ ] Training evidence is role-based and includes effectiveness indicators.
[ ] Control testing and audits produce owned, time-bound remediation.
[ ] High-risk findings have verified evidence of closure.
[ ] Evidence retention, access, integrity and secure transfer are defined.
[ ] The evidence index identifies missing or outdated records.
[ ] Evidence is reviewed after major incidents, changes and regulatory updates.

18. How this guide fits with implementation and audit work

Documentation should emerge from real controls. Organisations that are still designing their programme can use TTMS’s practical guide to implementing NIS2 for a broader implementation perspective. For a general overview of business duties, see cybersecurity obligations of businesses under NIS2.

An implementation programme creates and operates controls. An evidence programme makes their ownership, decisions and results demonstrable. An audit or assessment then evaluates whether the measures and evidence satisfy the applicable criteria. These activities support one another but should not be confused.

19. Why TTMS?

Building a NIS2 evidence model requires an understanding of regulation, governance and the technology that generates operational records. TTMS can support organisations in mapping applicable requirements to services, controls, owners and evidence sources, then integrating those records into practical workflows.

Support may include evidence-readiness assessments, governance and responsibility design, control mapping, documentation frameworks, supplier assurance, incident and continuity exercises, technical-control verification and remediation planning. The objective is not to create documents for their own sake. It is to help the organisation establish records that reflect working security measures and provide management with reliable assurance.

Engagement scope should be tailored to the entity’s legal position, national requirements, risk profile and existing management systems.
Legal conclusions should be confirmed by appropriately qualified advisers, while technical and organisational evidence should support those conclusions accurately. 20. Prepare a defensible NIS2 evidence pack Organisations should not wait for an authority request or audit notice before locating their records. Start with the services in scope, the decisions management must make and the controls protecting those services. Then identify which reliable records demonstrate operation and effectiveness. Contact TTMS to discuss a NIS2 documentation and evidence-readiness assessment tailored to your organisation. For authoritative background, consult the European Commission overview of the NIS2 Directive, ENISA’s NIS2 implementation resources and the official text of Directive (EU) 2022/2555 on EUR-Lex. 21. Frequently asked questions about NIS2 compliance documentation What documentation is required for NIS2 compliance? NIS2 does not prescribe one universal document pack. Covered entities need records sufficient to demonstrate management approval and oversight, appropriate and proportionate risk-management measures, significant-incident reporting and compliance with applicable national procedures. Typical evidence includes scope analysis, governance decisions, risk records, policies, operational control records, supplier assurance, continuity tests, incident files, metrics, audits and remediation. Is a policy enough to prove NIS2 compliance? No. A policy describes the intended approach. Evidence of operation may include approvals, system records, reviews, test results, incidents, exceptions and corrective actions. A reviewer should be able to connect the policy to actual controls and accountable owners. Does NIS2 require an information security management system? NIS2 requires a governed set of appropriate and proportionate cybersecurity risk-management measures. 
National law may expressly require an information security management system, and an ISMS is a practical way to organise policies, risk management, controls and improvement. Organisations should verify the terminology and detailed requirement in each relevant jurisdiction. Does ISO 27001 certification provide sufficient evidence? ISO/IEC 27001 certification can provide valuable evidence, but it is not automatic proof of complete NIS2 compliance. The certification scope, exclusions and statement of applicability matter. Legal scope, management duties, national registration and incident-reporting procedures still require specific assessment. How long should NIS2 evidence be retained? There is no single NIS2 retention period covering every record. The organisation should define retention using national law, sector obligations, audit cycles, limitation periods, contractual duties, investigation needs and risk. Sensitive evidence should be disposed of securely when retention is no longer justified. Should every security log be placed in the evidence pack? No. The pack should provide a controlled view of relevant evidence. Operational logs may remain in their source systems, with an index describing ownership, scope, retention and retrieval. Export only what is necessary and protect sensitive technical information. What evidence should the management body receive? Management should receive information enabling approval and effective oversight: material risks, measure implementation, significant incidents, control failures, critical supplier exposure, effectiveness results, overdue high-risk actions and decisions requiring acceptance or investment. How should incident-reporting decisions be documented? Record awareness time, affected services, severity and impact, applicable thresholds, known and unknown facts, the decision-maker and the basis for reporting or not reporting. Keep copies of notifications, acknowledgements and subsequent updates. 
Follow applicable national procedures. What supplier evidence is useful for NIS2? Useful records include supplier criticality, due diligence, assurance reports, contractual security provisions, access reviews, monitoring, incident cooperation, continuity arrangements and exit plans. Evidence depth should reflect the supplier’s access and potential impact on covered services. How often should the NIS2 evidence pack be reviewed? Set a risk-based schedule and update records through normal operations. Additional review should follow material incidents, acquisitions, major system or service changes, new critical suppliers, significant control failures and legal updates. Who should own NIS2 compliance documentation? Ownership is distributed. Legal or compliance teams may maintain the requirements map, while security, IT, service owners, procurement, HR and continuity teams own operational records. A central coordinator should manage the evidence index, quality checks and escalation without becoming the artificial owner of every control.
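The evidence index mentioned in the checklist and the FAQ can be prototyped in a few lines. The sketch below is a minimal illustration in Python, not a prescribed NIS2 format: the field names, the `max_age_days` review interval and the sample entries are all assumptions made for the example. It flags index entries whose record is missing or older than the interval the organisation has chosen, while the records themselves stay in their source systems.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class EvidenceEntry:
    """One row of a hypothetical evidence index (field names are illustrative)."""
    requirement: str               # requirement or control the record supports
    owner: str                     # accountable team or role
    source_system: str             # where the record lives; it is not copied here
    last_updated: Optional[date]   # None means no record has been located yet
    max_age_days: int              # assumed review interval set by the organisation


def classify(entry: EvidenceEntry, today: date) -> str:
    """Return 'missing', 'outdated' or 'current' for one index entry."""
    if entry.last_updated is None:
        return "missing"
    if today - entry.last_updated > timedelta(days=entry.max_age_days):
        return "outdated"
    return "current"


def gaps(index: list[EvidenceEntry], today: date) -> list[tuple[str, str, str]]:
    """List (requirement, owner, status) for every entry that is not current."""
    return [
        (e.requirement, e.owner, classify(e, today))
        for e in index
        if classify(e, today) != "current"
    ]


# Sample entries are invented for the example.
index = [
    EvidenceEntry("Quarterly access review", "IT security", "IAM tool", date(2026, 1, 15), 100),
    EvidenceEntry("Restore test report", "Continuity team", "Backup platform", None, 365),
    EvidenceEntry("Supplier due diligence", "Procurement", "Vendor register", date(2026, 5, 2), 365),
]

for requirement, owner, status in gaps(index, today=date(2026, 6, 1)):
    print(f"{status:9} {requirement} (owner: {owner})")
```

Keeping only metadata in the index follows the data-minimisation point made earlier: the index describes ownership and freshness without duplicating sensitive records.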

ChatGPT 5.6 in Practice: Initial Compliments and Disappointments

OpenAI rolled out GPT-5.6 in stages. It first appeared in limited test access for selected partners. Access to ChatGPT 5.6 reached Europe, including Poland, gradually, so only recently have teams been able to test the model in everyday work. Expectations are high. In the second half of 2026, businesses expect language models to handle multi-step tasks and work with extensive context. Ease of use matters too. GPT’s interface has undergone a major redesign. Has it improved the user experience and the quality of responses? This article explores that question, as well as: which business processes ChatGPT 5.6 can support by improving productivity and the quality of working materials, how to plan an AI pilot in your organisation, measure results and maintain quality control, which limitations of ChatGPT 5.6 to consider before a wider rollout, how to establish a shared standard for prompts and output validation across the team, what early users think about working with ChatGPT 5.6. If you are looking for a full overview of the changes, pricing, models and capabilities of GPT-5.6, see our article GPT-5.6 from OpenAI: what has changed, pricing, capabilities and business applications. ChatGPT 5.6: our first impressions and early industry feedback Early expert reviews focus primarily on context handling. Reviewers note that when working with substantial material that goes through multiple rounds of edits, ChatGPT 5.6 is better at keeping the task on track. Most of us have experienced earlier OpenAI models losing their “bearing”. On top of that, the model itself encouraged endless revisions, which could pull the material away from the original intent of the prompt. GPT 5.5 had an irritating habit of suggesting more and more variations. 
Almost every response ended with a clickbait-style suggestion along the lines of: “If you want, I can help you add two elements that will create a wow effect and give the text around 50% more SEO power.” As a result, instead of closing the topic, we were drawn into the model’s endless doubts: could the material really not be improved further? GPT 5.6 is no less capable than the older model, but it finally respects what matters most: the intent behind the prompt and our time.

Kajetan Terlecki, SEO Specialist, TTMS

Another recurring observation concerns the quality of the first draft, that is, the material GPT produces after the first prompt. Reviewers emphasise that the model’s draft is usually well structured and much closer to a final version than it was with GPT-5.5. It is not a perfect ten yet, but a solid eight. In other words, a final version may be within reach after a relatively short time. With earlier GPT models, the “brainstorming” phase took much longer.

The third area, and the most immediately noticeable, is the way we use the tool, which we can simply call the “interface”. It is admittedly quite complex. Beyond writing a prompt, users must make a series of decisions:

- Which workspace should I choose: Chat or Work?
- Which model best fits my request: Luna, Terra or the most advanced Sol? Or is the older GPT-5.5 enough?
- Does the task require Deep Research?
- How much effort should the model put into the task: low, medium, high, very high, max or ultra?
- Should I use Turbo mode and generate a response 50% faster at the cost of higher token use?

If we add the almost endless range of available plugins, writing the prompt turns out to be only half the work required to get a useful result. I would welcome an automatic mechanism that reads the prompt and selects the right settings on its own, using a sufficiently capable GPT model without wasting tokens when they are not needed.

How do you navigate all this?
We have outlined a suggested configuration here, including which modes to use for different types of tasks. Where does GPT 5.6 outperform the previous version? 1. GPT 5.6 is better at preserving document layout and formatting The previous version of GPT had something of a goldfish memory. You could also compare it to a short blanket: pull it over one part, and another is left exposed. When we asked the model to update data in a document it had generated, it produced a factually correct response, but one that no longer followed the original format. It might use a different heading hierarchy, rearrange the information or omit elements that are essential for the company. GPT 5.6 is much better at preserving the structure of reference material. OpenAI illustrated the difference in materials introducing GPT-5.6. The company placed three slides side by side: the reference file, the GPT-5.5 output and the GPT-5.6 output. The task was to update figures in a presentation while retaining the original template. In the comparison, GPT-5.5 omitted some template elements, while GPT-5.6 preserved the slide structure more faithfully: layout, typography, spacing, colours and recurring template elements. OpenAI states that GPT-5.6 can also interpret rules saved in the slide template, including the Slide Master. In practice, this matters when a presentation needs to retain not only its colours and fonts, but also defined layouts, spacing and mandatory components. 2. GPT-5.6 moves beyond the chat window GPT-5.6 shows its greatest potential when it works not only with a single instruction, but also with files and tools made available by the user. It can then move quickly through a task: from gathering the materials to preparing a first draft. The new GPT model can identify related files in a project folder, flag places that need updating and prepare working versions of documents. There is a catch: the process still needs human oversight. 
Someone must check whether GPT found all the relevant files, understood the context correctly and left unchanged the elements that were meant to remain unchanged. Still, instead of manually digging through documents, the team starts with a list prepared by the model. 3. From an idea to a version you can show the team Experts testing GPT 5.6 point out that the first version of a simple application, dashboard or website is now more often suitable for showing to a team and collecting specific feedback. It is somewhat like an MVP: good enough to test an idea, present it to the team and gather initial comments. A product owner can see the whole process, a designer can assess the layout and usability, and a developer can spot technical constraints sooner. This does not mean that GPT-5.6 creates a finished product. The initial prototype still needs to be assessed for security, quality and architecture. The difference is concrete, however: the team can evaluate an actual solution earlier, rather than debating assumptions alone. 4. GPT 5.6: “I don’t know” — is this the end of answers given for the sake of answering? We all know the old classified ad: “Encyclopaedia Britannica, 40 volumes for sale. I got married a week ago, so I no longer need it. My wife knows everything better.” The know-it-all syndrome is a nuisance not only in old marriage jokes, but also for people who work with language models every day. GPT often lacks the information needed to give a reliable answer. GPT-5.5, like earlier versions, would rather provide an incorrect—yet convincing-sounding—answer than admit it did not know. What about the new version? The change is visible at first glance, even though it is hard to capture in a benchmark and easy to appreciate in day-to-day work. Our first days of working with the two most advanced models, Terra and Sol, suggest that GPT 5.6 is more likely to say “I don’t know”, “I don’t have enough data” or “I could not find anything else on this topic”. 
People still need to add or verify information manually, but this reduces the risk of an embarrassing error in material prepared for a client, the board or a project team. Before you give GPT-5.6 an important task: what to watch out for in early testing 1. A working prototype is not yet a finished product GPT-5.6 can prepare a website, dashboard or simple application that can be launched and shown to the team. This is a major step forward, particularly when testing an idea. The tests also reveal the other side: elements can become misaligned, interactions do not always work as intended, and visual details still require refinement. The first version can be an excellent starting point, but it should not automatically be sent to clients or other external audiences. Before treating it as finished, we need testing, a security assessment and, in some cases, a developer’s review. 2. The new Work environment can still be frustrating Model quality is one thing. The way we use it in practice is another. One reviewer pointed out that, in Work, it was difficult to access generated files and open a preview of the finished result. Others criticised the number of settings—discussed earlier in this article—as well as the unclear distinction between Chat, Work and Codex. GPT-5.6 may complete a task correctly, while the working environment still makes it difficult to retrieve or review the result. It is worth testing the entire process, not only the quality of the response in the chat window. 3. GPT needs clear boundaries One reviewer tested how GPT-5.6 would handle a complex mathematical problem. The model produced correct parts of the solution, but surrounded them with definitions, digressions and comments that added little value. Only after the instruction was made more specific did it produce a useful result. The same applies in a business context. We should not leave the model too much room for interpretation. 
It is better to state the expected result directly: “Prepare a one-page summary. Include the decision, three arguments, risks, missing information and next steps.” GPT then has fewer opportunities to pad the topic with peripheral content. 4. GPT can still be wrong The fact that GPT-5.6 appears more likely to signal that it lacks data or a basis for drawing a conclusion does not mean it is free from hallucinations. Luna, Terra and Sol—with Sol seemingly the least prone to this—can still provide an incorrect date, number, source or conclusion without batting an eyelid. The rule to “check after AI” still applies and will likely remain relevant for many future GPT releases. 5. Start with one problem, not a large system Once GPT-5.6 has access to files, a browser and company tools, it is easy to imagine a system that instantly organises the inbox, analyses team communication, updates the CRM and writes responses to clients. This vision can quickly turn into a project larger than the problem it was meant to solve. One expert working with an extensive Codex environment recommends starting with a single, repeatable task. It might be preparing a meeting summary, gathering open project issues or updating an offer after data changes. Only once the team sees measurable results and understands the tool’s limitations is it worth adding further automations. How should you run your first ChatGPT 5.6 test in the company? A pilot should answer one straightforward question: does GPT-5.6 genuinely improve a selected stage of work, and does the benefit justify the time, cost and additional quality control? The first test should not begin with building an extensive automation system. It is better to choose one repeatable task that currently takes up the team’s time and has a clearly defined outcome. This might be a meeting summary, a brief or a status report. 
What matters is that the team knows which materials it provides to the model, what result it expects and who reviews the final document.

Before starting the pilot, work through five steps:

- Choose one process: for example, preparing meeting summaries, sales briefs or materials for project decisions.
- Set a baseline: measure the time needed to prepare the material, the number of revisions, the number of people involved and the most common errors.
- Prepare a shared prompt: use the same input materials and clearly describe the outcome the team expects.
- Assign expert review: nominate a person who will verify the facts, assess quality and approve the result before it is used further.
- Assess the outcome: compare time, the number of iterations, completeness of the material and the usefulness of the result for the next stage of the process.

| Pilot element | Question for the team |
| --- | --- |
| Process | Which stage of work do we want to shorten or organise? |
| Outcome | What should be produced: a brief, decision list, analysis, recommendation or communication draft? |
| Data | Which materials are needed, and can they be used in the selected AI environment? |
| Quality control | Who confirms the facts, completeness and alignment of the material with the process? |
| Metric | How will we compare working time, the number of revisions and the usefulness of the result? |

After a few attempts, it becomes easier to assess whether the model is genuinely helping. Compare the time needed to prepare the material, the number of revisions and the effort required to verify the result. Only then decide whether to extend the pilot to further tasks.

Three processes worth starting with

1. Summaries after client meetings

The model can organise notes, gather decisions, identify open questions and prepare a list of next steps. The team confirms the arrangements and assigns task owners. This helps them move from discussion to action more quickly.

2.
A brief for a sales conversation Based on selected sales materials, previous arrangements and public information about the company, GPT-5.6 can prepare a brief, discovery questions and a list of topics that require clarification. The salesperson remains responsible for the client relationship and decisions regarding the offer. 3. A status report for the project team The model can organise information about progress, blockers, risks and planned actions. The project owner confirms that the information is up to date before the report is shared further. This reduces the time the team spends manually consolidating data from several sources. How do you embed AI in a business process? After the pilot, it becomes clear whether ChatGPT 5.6 genuinely shortens the preparation of materials, reduces the number of revisions and helps the team move more quickly to the next stage of work. It also reveals where the model needs a better brief, access to data or expert oversight. Proven use cases can then be extended to other processes. At this stage, it is worth addressing data security, integration with existing tools, output quality and a clear division of responsibilities. These factors determine whether AI becomes lasting support for the organisation. At TTMS, we help organisations identify processes where automation and AI create business value. We then design solutions tailored to their data, regulatory requirements and ways of working. We combine engineering experience with a responsible approach to AI governance, confirmed by ISO/IEC 42001 certification. Let’s discuss the processes AI could support in your organisation. FAQ How do you choose a process for your first ChatGPT 5.6 test? The best candidate is a repeatable process that requires gathering several pieces of information and producing a predictable result. Examples include meeting summaries, sales briefs, status reports and document analysis. 
The team should know the current turnaround time and typical issues, as these provide the baseline for assessing the test. Start with one process and expand the use of AI only after evaluating the outcome. How do you measure the business value of ChatGPT 5.6? During a pilot, measure the time needed to prepare the first version of the material, the number of revisions before approval, the completeness of the output and the expert time required for verification. It is also useful to track metrics related to the next stage of the process – for example, faster meeting preparation, a shorter time to close agreed actions or fewer missing details in a report. This data helps assess team productivity based on actual results and supports decisions about integrating AI into further processes. What data should you prepare for working with ChatGPT 5.6? The model produces better results when the team provides current, well-organised source materials. Before starting, identify which documents take priority, which data must remain unchanged and how unverified information should be marked. The organisation should also define which data can be shared in the chosen AI environment. For personal, financial and confidential data, access rules, retention and compliance are essential. How do you maintain human oversight of the model’s work? Human oversight should be part of the process from the start. The process owner defines the task scope, an expert verifies facts and alignment with requirements, and an authorised person approves external actions. This division of responsibilities is particularly important for client communication, publications, data changes in systems and materials with legal or financial implications. It allows the team to use automation while retaining responsibility for the outcome. Where can I find information about GPT-5.6 pricing, models and capabilities? 
We have covered the changes in GPT-5.6, pricing, the Sol, Terra and Luna models, and business applications in a separate article: GPT-5.6 from OpenAI: what has changed, pricing, capabilities and business applications. This article focuses on the practical use of ChatGPT 5.6 in team workflows, early user experiences and how to run an AI pilot in an organisation.
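The pilot comparison described above (preparation time, number of revisions and expert verification time, measured before and during the pilot) comes down to simple arithmetic. The following Python sketch shows one possible way to tabulate it. The `Run` fields, the choice to count preparation and review time together, and every number in the sample data are assumptions made for illustration, not measurements from any real pilot.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    """Measurements for one prepared document (all values are illustrative)."""
    prep_minutes: float     # time to produce the version sent for approval
    revisions: int          # revision rounds before approval
    review_minutes: float   # expert time spent verifying the result


def summarise(runs: list[Run]) -> dict[str, float]:
    """Average the three pilot metrics over a set of runs."""
    return {
        "prep_minutes": mean(r.prep_minutes for r in runs),
        "revisions": mean(r.revisions for r in runs),
        "review_minutes": mean(r.review_minutes for r in runs),
    }


def net_minutes_saved(baseline: list[Run], pilot: list[Run]) -> float:
    """Average total effort per document before the pilot minus during it.

    Total effort counts preparation and expert review together, so extra
    verification time reduces the reported saving.
    """
    before = summarise(baseline)
    after = summarise(pilot)
    return (before["prep_minutes"] + before["review_minutes"]) - (
        after["prep_minutes"] + after["review_minutes"]
    )


# Invented sample data: three documents before the pilot, three during it.
baseline = [Run(60, 3, 10), Run(50, 2, 10), Run(70, 4, 10)]
pilot = [Run(25, 2, 20), Run(20, 1, 15), Run(30, 1, 25)]

print(summarise(baseline))
print(summarise(pilot))
print(f"Net minutes saved per document: {net_minutes_saved(baseline, pilot):.1f}")
```

Counting review time on both sides matters: in the invented data the draft is produced 35 minutes faster, but verification takes 10 minutes longer, so the net saving is 25 minutes per document.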

5 Most Common Gaps Identified When Preparing for KSC 2.0

Preparing an organization for KSC 2.0 involves more than drafting security policies and incident response procedures. Only an assessment of how the organization actually operates can show whether documented rules are followed in practice, responsibilities have been clearly assigned and teams can respond effectively under time pressure. It is particularly important now that the Polish amendment to the Act on the National Cybersecurity System, implementing the NIS2 Directive, is already in force. The provisions took effect on 3 April 2026. Entities that met the criteria for classification as a key or important entity on that date and are not entered ex officio in the KSC Register should submit an application for entry by 3 October 2026. Organizations are therefore no longer preparing for a future regulation; they are implementing specific obligations concerning, among other things, risk management, incident handling, business continuity and supplier security. Based on gap analyses and compliance audits conducted by TTMS experts in 2026, we have observed that the issue is rarely a single isolated non-compliance. More often, organizations face several interconnected deficiencies that can make it harder to meet statutory requirements and delay incident response. This article presents the five gaps we identify most frequently, their practical consequences and the areas that should be verified first. 1. What Is a NIS2 Audit and Why Does Your Organization Need One? A NIS2 audit is an assessment process used to determine how effectively an organization meets the requirements of the Directive and the Polish Act on the National Cybersecurity System. In practice, TTMS auditors review IT systems, risk management procedures and incident response plans, and then compare the actual state of operations with the applicable legal obligations. 
The assessment of security measures is based primarily on Article 21 of the NIS2 Directive and Article 8 of the KSC Act, which requires the implementation of an information security management system. Organizations that verify compliance early gain time to implement improvements in a controlled manner instead of acting under the pressure of an inspection. 1.1 The NIS2 Directive in Brief The NIS2 Directive is an EU legislative act on the security of network and information systems that replaced the earlier NIS framework. It introduces significantly stricter requirements than its predecessor, particularly for organizations whose operations are important to the functioning of the state and the economy. Its purpose is to harmonize security standards across the European Union and materially strengthen resilience against cyberattacks. 1.2 Purpose and Scope of a NIS2 Compliance Audit The purpose of an audit is to assess the extent to which an organization meets the Directive’s requirements and to identify specific security gaps together with a remediation plan. The scope covers both technical matters, such as network configuration and access management, and organizational matters, including security policies, risk management procedures and business continuity plans. A well-executed audit produces an actionable implementation roadmap, not merely a list of deficiencies. 2. Who Is Subject to KSC 2.0 and When Is an Audit Required? The amendment covers key and important entities operating in the sectors listed in Annexes 1 and 2 to the Act, including energy, transport, healthcare, digital infrastructure, selected manufacturing industries and digital services. Whether an organization falls within the scope of the Act depends on its sector, type of activity, company size and specific statutory criteria. Some entities are covered regardless of their headcount or turnover. 
2.1 Covered Sectors and Company Size The threshold of 50 employees or EUR 10 million in turnover should not be treated as a standalone test. In many sectors, medium-sized or large-enterprise status is the starting point, but the Act provides exceptions and separate qualification rules. The first step should therefore be to compare the organization’s actual activities with Article 5 and Annexes 1 and 2 to the KSC Act. 2.2 Key Entities and Important Entities: Differences in Requirements Key and important entities are generally subject to a similar set of obligations relating to risk management, incident handling and supply-chain security, subject to the exceptions provided for in the Act and sector-specific regulations. The primary differences concern the supervision and audit model. Under Article 15 of the KSC Act, a key entity must conduct a security audit at its own expense at least once every three years. The competent authority may order an external audit of a key entity at any time and of an important entity following a significant incident or another breach of the Act. 3. Is a NIS2 Audit Mandatory and When Should It Be Performed? Not every gap analysis offered on the market constitutes a statutory audit. The periodic audit obligation under Article 15 applies to key entities, while a voluntary gap analysis can help both key and important entities assess readiness, set priorities and gather evidence of compliance. A statutory audit must be conducted by an organization or by at least two auditors meeting the qualification requirements set out in Article 15(2), while complying with the independence requirement in Article 15(2a). 3.1 Key KSC 2.0 Deadlines in Poland Poland implemented the NIS2 Directive through the Act of 23 January 2026 amending the Act on the National Cybersecurity System and certain other acts (Journal of Laws of 2026, item 252). The Act was published on 2 March 2026, and its principal provisions entered into force on 3 April 2026. 
For entities that met the criteria for classification as a key or important entity on the effective date, self-registration in the KSC Register runs from 7 May to 3 October 2026, unless the entity is entered ex officio. Entities in this group should comply with the obligations in Chapter 3 no later than 3 April 2027. Key entities in this group must conduct their first statutory security audit by 3 April 2028. For entities brought within the scope of the Act at a later date or entered by administrative decision, the applicable deadline must be determined under the provision governing the relevant procedure. 3.2 How Often Should a Compliance Audit Be Repeated? Under Article 15 of the KSC Act, the statutory audit of a key entity must be conducted at least once every three years. Irrespective of that requirement, we recommend an annual internal compliance review and an additional assessment after any material change, such as an IT infrastructure upgrade, implementation of a new system, a significant incident or a change of a critical service provider. Security cannot be configured once and then forgotten. 4. Consequences of NIS2 Non-Compliance Failure to perform the obligations arising from the KSC Act implementing NIS2 may have serious consequences. These include supervisory measures, orders to remedy infringements and administrative fines. 4.1 Financial Penalties and Administrative Sanctions Entities that fail to perform their obligations under the KSC Act may be subject to supervisory measures and administrative sanctions. The Act provides for high maximum penalties and, where an infringement creates a particularly serious threat, a fine of up to PLN 100 million. Under Article 35 of the amending Act, the new penalties specified in that provision may first be imposed two years after the Act entered into force, generally from 3 April 2028. This does not postpone the deadlines for registration, implementation of obligations or incident reporting. 
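As a planning aid, the dates given in sections 3.1, 3.2 and 4.1 can be collected in one place. The sketch below is illustrative only: the names `MILESTONES`, `upcoming` and `next_audit_due` are hypothetical, the dates apply to entities that met the criteria on the effective date, and counting the three-year audit interval in calendar years is an assumption rather than a statement of how the Act computes the period.

```python
from datetime import date

# Dates as stated in this article for entities that met the criteria on the
# effective date. Entities covered later follow their own procedure.
MILESTONES = {
    "Principal provisions in force": date(2026, 4, 3),
    "Self-registration opens": date(2026, 5, 7),
    "Self-registration closes": date(2026, 10, 3),
    "Chapter 3 obligations met": date(2027, 4, 3),
    "First statutory audit (key entities)": date(2028, 4, 3),
}

# "At least once every three years" (Article 15 of the KSC Act).
AUDIT_INTERVAL_YEARS = 3


def upcoming(today: date) -> list[tuple[str, int]]:
    """Return (milestone, days remaining) for milestones not yet passed."""
    return [
        (name, (due - today).days)
        for name, due in MILESTONES.items()
        if due >= today
    ]


def next_audit_due(last_audit: date) -> date:
    """Assumed latest date for the next audit: three calendar years later."""
    target_year = last_audit.year + AUDIT_INTERVAL_YEARS
    try:
        return last_audit.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return last_audit.replace(year=target_year, day=28)


if __name__ == "__main__":
    for name, days in upcoming(date.today()):
        print(f"{name}: {days} days remaining")
```

A list like this is a reminder tool, not a substitute for confirming the deadline that applies to a specific entity.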
4.2 Management Liability and Reputational Risk Failure to perform statutory obligations may also result in a personal fine being imposed on the head of a key or important entity. Article 73a of the KSC Act provides for a fine of up to 300% of the person’s remuneration and, for certain public-sector entities, up to 100% of remuneration. The person regarded as the head of a particular entity depends on its legal form and governance structure. Irrespective of sanctions, an incident and disclosed negligence may also undermine the trust of customers and business partners. 5. What Does a NIS2 Audit Cover? This part of our work as auditors is particularly revealing because it shows precisely where organizations encounter the most common difficulties. Below, we describe the five areas in which we most frequently identify gaps during KSC 2.0 readiness projects, together with practical examples and the consequences of leaving them unresolved. 5.1 Unclear Accountability and Immature Risk Management The first thing we verify is who formally holds responsibility for cybersecurity within the organization. Our experience shows that unclear accountability is one of the most frequently identified issues. Roles across IT, security and management may be documented, yet in practice there is no unambiguous decision-making path for every type of significant incident. Valuable hours are then spent determining who is authorized to make a decision instead of responding to the incident. This issue is closely linked to immature risk management. Many organizations have a document entitled ‘Risk Management Policy’, but the assessment was performed only once and has not been updated since. Article 21 of the NIS2 Directive and Article 8 of the KSC Act require appropriate and proportionate technical, operational and organizational measures based on systematic risk management. 
If an organization cannot demonstrate a recurring process, it also lacks a reliable understanding of where it is genuinely most exposed. 5.2 Incomplete IT and OT Asset Inventory An incomplete or outdated inventory of IT and OT assets appears very frequently in our assessments. A typical example is a manufacturing company that declares full control over its infrastructure, yet during workshops no one can clearly state how many active servers it operates, which systems are outdated or which OT devices can access the corporate network. Without a reliable inventory, risk assessment becomes largely theoretical: an organization cannot assess the risk associated with an asset it does not know exists. During an incident, the team then loses time determining what has actually been compromised. 5.3 Untested Incident Response Procedures Our observations indicate that, in most organizations assessed, the incident response procedure existed only as documentation and had never been tested in practice. Article 23 of the NIS2 Directive and Article 11 of the KSC Act provide for multi-stage reporting: an early warning must be submitted without undue delay and no later than 24 hours after detecting a significant incident, followed by an incident notification no later than 72 hours after detection. The required reports must then be submitted, including a final report generally within one month of the incident notification. The procedure must therefore work at night, at weekends and when key personnel are unavailable. 5.4 Inadequate Business Continuity Plans An incident response procedure is not sufficient if the organization cannot maintain or restore critical services. In practice, we verify whether business continuity and disaster recovery plans cover critical dependencies, suppliers, backups, crisis communications and realistic recovery times. 
Article 21(2) of the NIS2 Directive and Article 8 of the KSC Act identify business continuity, backup management, disaster recovery and crisis management as elements of cybersecurity risk-management measures. A plan that has never been tested remains an assumption rather than evidence of resilience.

5.5 No Systematic Supplier Risk Assessment

Supplier security management remains one of the greatest challenges. In the vast majority of organizations assessed by TTMS, there was no systematic evaluation of risks associated with service providers or partners that had access to the organization’s systems. Article 21(2)(d) of the NIS2 Directive and Article 8 of the KSC Act expressly cover supply-chain security. A typical example from our work is an external IT provider with remote access to company systems whose security controls have never been verified. An attack on such a partner can directly threaten the organization using its services.

5.6 Summary of the Five Most Common Gaps

| Area | Observation from TTMS Projects |
|---|---|
| 1. Accountability and risk management | Frequently identified issue |
| 2. IT and OT asset inventory | Very frequent |
| 3. Testing of incident response procedures | Most organizations assessed |
| 4. Business continuity | Often requires additional testing and clarification |
| 5. Supplier risk assessment | The vast majority of organizations assessed |
| High-risk gaps identified in a single audit | Usually between one and several |

The data in the table consists of anonymized qualitative observations from gap analyses and audits conducted by TTMS in 2025–2026. It is not a representative market study.

7. How a NIS2 Audit Works: Step by Step

Below, we explain how we conduct a NIS2 audit for a client, step by step, from the initial contact through to the completed remediation roadmap.

Step 1: Determine Whether the Organization Is Subject to KSC 2.0

The first step is to establish whether the organization is subject to the KSC Act and whether it qualifies as a key or important entity.
This determination defines the subsequent scope of the assessment and the obligations that must be considered.

Step 2: Questionnaire and Baseline Data Collection

We then conduct a detailed questionnaire and collect baseline information from the IT, security and management teams. This allows us to build an initial picture of the organization’s security posture before examining the documentation in detail.

Step 3: Review of Documentation and Processes

The next stage involves reviewing the documentation and existing processes, comparing what is written on paper with what actually happens within the organization. This is where the discrepancies described earlier most often become visible, such as an incident response procedure that exists but has never been tested.

Step 4: Workshops and Team Interviews

We conduct workshops and interviews with employees from different departments because documentation rarely tells the whole story. A conversation with a network administrator or the person responsible for supplier relationships often reveals more than a formal review of documents.

Step 5: Findings and Recommendations Report

At the end of the assessment, we prepare a detailed report presenting the findings and specific remediation recommendations in language that is understandable not only to IT, but also to the organization’s management. The head of the entity and the relevant governing bodies are responsible for approving and overseeing implementation of the measures to the extent required by the Act and the entity’s governance structure.

Step 6: Remediation Roadmap

The final report includes a prioritized remediation roadmap. In practice, we typically identify between one and several high-risk non-compliances during a single audit. The roadmap is therefore not about implementing every recommendation at the same time, but about sequencing activities to reduce the most significant business risks as quickly as possible.

8. How to Prepare Your Organization for a NIS2 Audit

Preparing for a NIS2 audit requires involvement from every department, not only IT. It is worth collecting current security policy documentation, a list of systems and external suppliers, and appointing a person to act as the auditors’ primary point of contact. The better prepared the organization is at the outset, the faster and more efficiently the process can be completed, reducing both cost and pressure on the team.

9. NIS2 Audits and Other Security Audits: Key Differences

A NIS2 and KSC 2.0 compliance assessment differs from other security reviews because it addresses specific regulatory obligations arising from the Act on the National Cybersecurity System. ISO/IEC 27001 certification is generally voluntary, while a GDPR compliance audit focuses on personal data protection obligations. These scopes may partially overlap, but none of them automatically replaces an assessment of compliance with KSC 2.0.

10. Benefits of Commissioning a NIS2 Audit from TTMS

TTMS is a global IT company specializing in the implementation and maintenance of bespoke IT systems, business process automation and outsourcing services. With experience in systems integration, Salesforce, Microsoft and AEM implementations, as well as IT service management, our consultants understand not only regulatory requirements but also the real-world IT infrastructure architectures our clients operate.

10.1 Scope and Delivery of Our Service

We provide a comprehensive NIS2 and KSC 2.0 readiness and gap assessment covering all the areas described above: from asset inventory, risk management and incident response procedures to supply-chain security. We follow a proven process, starting with an initial questionnaire, continuing through team workshops and concluding with an actionable roadmap. If the engagement includes a statutory audit under Article 15, the scope, auditor qualifications and independence requirements must be confirmed separately.
10.2 Support with Implementing Post-Audit Requirements The real value of an audit lies in implementing its recommendations, not merely producing a report. After completing projects, we observe that clarifying accountability, updating documentation and implementing remediation measures shorten incident response times, improve asset records and reduce the number of non-compliances found during subsequent reviews. Our support includes security process automation, integration of monitoring systems and development of procedures that work in teams’ day-to-day operations. 11. Contact a TTMS Expert and Prepare Your Organization for a NIS2 Audit 11.1 Make Sure Your Organization Is Ready for KSC 2.0 KSC 2.0 readiness is difficult to assess from documentation alone. The key is to verify whether responsibilities, processes and safeguards work in practice and whether the organization can demonstrate compliance during an audit or inspection. If you would like to discuss your organization’s situation, contact TTMS experts. We will help determine which areas require verification, what audit scope is appropriate and where preparations should begin. We will tailor the engagement to the entity’s status and its obligations under KSC 2.0. 12. Legal Basis and Sources Directive (EU) 2022/2555 of the European Parliament and of the Council (NIS2), in particular Articles 20, 21, 23, 32 and 33; the Act of 5 July 2018 on the National Cybersecurity System, as amended by the Act of 23 January 2026 (Journal of Laws of 2026, item 252), in particular Articles 5, 8, 11, 15, 73 and 73a and Annexes 1 and 2; Articles 33–35 of the amending Act; and communications from the Polish Ministry of Digital Affairs concerning the KSC Register and the S46 System. The legal status and implementation timeline were verified on 13 July 2026. 13. FAQ Is a Gap Analysis the Same as a Statutory KSC Audit? No. A gap analysis is a voluntary readiness assessment that helps identify deficiencies and prioritize actions. 
A statutory security audit under Article 15 of the KSC Act must meet the requirements relating to scope, auditor qualifications and independence. What Is a NIS2 Compliance Audit? A NIS2 compliance audit is a market term for an assessment process that verifies an organization’s readiness for the requirements of the NIS2 Directive and the KSC Act. It may cover IT systems, risk management and incident response. However, not every such review constitutes a statutory security audit under Article 15 of the KSC Act, which must meet the applicable requirements concerning scope, auditor qualifications and independence. What Does NIS2 Involve? NIS2 is an EU directive that introduces rigorous network and information systems security requirements for organizations in key and important sectors. Its purpose is to harmonize security standards across the European Union and strengthen resilience against cyberattacks. How Much Does a NIS2 Audit Cost? The cost of a NIS2 audit depends on the size of the organization, the number of systems and locations covered by the review, and the scope of support required to implement the recommendations. An accurate quotation can be provided after a short initial discussion in which we establish the actual scope of work.

Best QA Practices in Software Testing – 2026 Guide

Quality assurance has moved well beyond end-of-cycle sign-offs. Today, the best QA practices in software testing are woven into the full development lifecycle, shaping how teams write requirements, review code, deploy releases, and measure outcomes. Yet despite widespread awareness of this shift, many organizations still struggle to close the gap between knowing what good QA looks like and actually executing it at scale. This guide brings together the most effective software testing best practices for 2026, covering everything from requirement alignment and shift-left integration to test automation strategy, environment management, and continuous improvement. Whether you’re building out a new QA function or refining an existing one, these principles offer a practical, experience-backed foundation for improving quality across your entire software delivery process. 1. Software Testing Best Practices for a Scalable and Effective QA Process Scalable QA doesn’t happen by accident. It requires a deliberate combination of process design, tooling, collaboration, and measurement that evolves alongside the product it supports. The most effective QA testing workflows share a common structure: quality is considered at every stage of development, not bolted on at the end. What separates high-performing teams from the rest isn’t always budget or headcount. It’s how consistently they apply software testing best practices across people, processes, and tools. Throughout this guide, those practices are organized into the stages where they have the most impact, giving teams a clear path to a more reliable and scalable QA process in software testing. 2. Why Most QA Processes Fall Short (and How High-Performing Teams Fix It) Most QA failures don’t come from a lack of testing tools. They come from structural and cultural gaps that quietly erode quality over time. 
Research consistently identifies the same root causes: siloed ownership between QA and development, unstable test environments, poor risk prioritization, and automation strategies that create more maintenance burden than value. High-performing teams fix this by shifting their thinking before they shift their tooling. They embed QA early, distribute quality ownership across roles, and use data to drive process decisions rather than team performance scores. Collaboration between developers, product managers, and QA engineers replaces the handoff model, and shared definitions of “done” replace ambiguous release criteria. The result is a software QA process that catches issues earlier, releases more confidently, and improves continuously. 3. Start With Requirements: The Foundation of Effective QA No testing practice compensates for unclear requirements. Vague, incomplete, or late-changing software testing requirements are among the most common sources of downstream bugs, rework, and test coverage gaps. Strong QA processes address this upstream, before a single line of code is written. 3.1 Align QA Objectives to Business and User Goals Effective QA begins with understanding what success actually looks like for the business and the user. When QA objectives are disconnected from business outcomes, testing can produce impressive pass rates while missing the behaviors that matter most to real users. Aligning your QA testing approach to business goals means involving QA in stakeholder conversations early, mapping test coverage to user journeys, and treating product quality as a measure of value delivered, not just defects avoided. Establishing measurable quality requirements early is essential. 
Rather than vague descriptors like “the system should respond quickly,” well-aligned criteria look like “response time must be under 200 milliseconds.” This kind of specificity, co-developed with product managers and stakeholders, prevents downstream surprises and keeps testing efforts anchored to what the business actually needs. 3.2 Define Acceptance Criteria Before Writing a Single Test Acceptance criteria function as the contract between what is built and what is expected. Defining them before any test cases are written is one of the most impactful quality assurance best practices a team can adopt. Structured formats like Given/When/Then (used in behavior-driven development) make criteria clear, testable, and accessible to both technical and non-technical stakeholders, which is especially valuable in complex user flows. 4. Shift Testing Left: Involve QA Earlier in the Dev Cycle Shifting testing left means moving quality activities earlier in the software development lifecycle, from a post-development checkpoint to an active part of planning, design, and coding. This is one of the most consistently recommended QA best practices in agile environments, and for good reason. 4.1 How Shift-Left Testing Reduces Bug Costs The cost of fixing a defect grows substantially the later it’s discovered. A bug caught during requirements review takes minutes to resolve. The same bug found in production can take days and trigger cascading failures. Shift-left testing compresses this gap by creating faster feedback loops, where QA, developers, and product managers are aligned on expected behavior before implementation even begins. Early QA involvement also reduces rework. When testers participate in architecture and design reviews, they surface quality risks and ambiguous requirements before they become coded assumptions. 
This is what some industry practitioners now call “shift-smart” testing: moving beyond just earlier testing to applying the right test thinking at each stage of the SDLC. 4.2 Practical Ways to Integrate QA Into Planning and Design Phases Integrating QA into planning doesn’t require a process overhaul. In agile environments, it starts with including QA engineers in sprint planning sessions as active participants in defining user story complexity, identifying testability concerns, and agreeing on acceptance criteria before stories are marked “ready for development.” QA teams should review user stories and wireframes alongside business analysts, flagging gaps or compliance requirements (such as HIPAA or GDPR considerations) before coding begins. These testing prerequisites become part of the “Definition of Ready,” ensuring that development only starts on work that is properly specified and testable. 5. Building a Scalable QA Strategy and Testing Approach A strong QA strategy is more than a list of test types. It’s a deliberate plan for how quality will be ensured across every layer of the application, aligned to team capacity, risk tolerance, and delivery speed. Building that strategy well from the outset prevents technical debt in testing from accumulating alongside the product itself. 5.1 What a Strong QA Test Plan Covers A QA test plan serves as the governing document for a team’s testing activities. It defines the scope of testing, the objectives aligned to release goals, the methodologies to be applied (such as functional, regression, performance, or API testing), the required testing prerequisites, the roles and responsibilities, and the entry and exit criteria for each phase. Without this structure, testing becomes reactive and inconsistent. 5.2 Risk-Based Prioritization: Test What Matters Most First Not all features carry equal risk, and testing everything equally is neither feasible nor effective. 
Risk-based prioritization is one of the most valuable quality assurance practices a team can adopt, directing testing effort toward the areas most likely to fail and most damaging if they do. This means analyzing each feature or component based on its business criticality, complexity, change frequency, and historical defect rate. A checkout flow in an e-commerce platform carries far more risk than an infrequently used settings page, and the test suite should reflect that reality. This approach prevents the false confidence that comes from high test counts with low-impact coverage.

5.3 Building a Sustainable Test Automation Strategy

Automation is a force multiplier for QA teams, but only when it’s applied strategically. A sustainable test automation strategy focuses on automating tests that are stable, repetitive, and executed frequently, while keeping manual testing in place for scenarios that require judgment, creativity, or exploratory thinking. Many teams now run more than one automation framework, and tools such as Playwright and Selenium are common choices. Teams evaluating their automation platform have several strong options, including open-source frameworks built around Playwright or Cypress and dedicated low-code tools. For example, the Qatana platform is one option: it supports Playwright-based automation within a unified hybrid testing workflow that links manual test cases with automated execution. The right choice depends on your team’s existing stack, required integrations, and maintenance capacity.

5.4 When to Automate and When Not To

Automation is not always the right choice. Tests that are highly unstable, rarely executed, or deeply reliant on visual or contextual judgment often cost more to maintain than they save in execution time. New features in active development are also poor candidates for early automation, since requirements change frequently and automated scripts break just as quickly.
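The criteria above can be written down as a rough checklist. The following is a minimal sketch, not a definitive rule: the `CandidateTest` fields, the `MIN_RUNS_PER_MONTH` threshold and both function names are hypothetical, and each team would set its own cut-offs.

```python
from dataclasses import dataclass


@dataclass
class CandidateTest:
    name: str
    is_stable: bool             # gives consistent results from run to run
    runs_per_month: int         # how often the test is executed
    needs_human_judgment: bool  # visual, contextual, or exploratory assessment
    feature_in_flux: bool       # requirements for the feature still changing


# Hypothetical threshold; tune it to your own release cadence.
MIN_RUNS_PER_MONTH = 4


def automation_blockers(test: CandidateTest) -> list[str]:
    """List the reasons this test is a poor automation candidate right now."""
    reasons = []
    if not test.is_stable:
        reasons.append("unstable")
    if test.runs_per_month < MIN_RUNS_PER_MONTH:
        reasons.append("rarely executed")
    if test.needs_human_judgment:
        reasons.append("needs human judgment")
    if test.feature_in_flux:
        reasons.append("feature still changing")
    return reasons


def should_automate(test: CandidateTest) -> bool:
    """Automate only when none of the blockers apply."""
    return not automation_blockers(test)


if __name__ == "__main__":
    checkout = CandidateTest("checkout regression", True, 30, False, False)
    usability = CandidateTest("onboarding usability review", True, 1, True, False)
    for test in (checkout, usability):
        print(test.name, should_automate(test), automation_blockers(test))
```

Returning the list of blockers, rather than only a yes or no, makes it easier to revisit a test later, for example once a feature has stabilized.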
The guiding principle is simple: automate for consistency and speed, test manually for discovery and judgment. Regression suites, smoke tests, API validations, and data-driven tests are natural candidates for automation. Exploratory testing, usability assessments, and complex user journey validation are better handled by skilled quality assurance manual testing practitioners who can adapt in real time. 5.5 Keeping Tests Maintainable: Avoiding Automation Debt Automation debt accumulates when test scripts are written without maintenance in mind. Fragile locators, hardcoded values, test cases with no clear owner, and suites that haven’t been reviewed in months all contribute to an automation layer that slows teams down rather than speeding them up. This is one of the most common QA process improvements teams neglect until it becomes a crisis. Regular test suite reviews, clear ownership of automated tests, modular script design, and retiring outdated tests are the core disciplines that keep automation sustainable. Teams using any modern platform, whether Qatana, or an in-house Playwright setup, can apply the same principle: treat test maintenance as a recurring sprint task, not a quarterly cleanup. Qatana specifically addresses this by using AI to generate draft test cases and support regression suite selection from release notes and ticket content, reducing the manual overhead of keeping test suites current. 6. Integrate QA Into CI/CD Pipelines Continuous integration and continuous delivery pipelines are now the standard deployment model for fast-moving teams. Integrating QA into these pipelines ensures that quality is validated at every stage of the release process, not just at the end. 6.1 Automated Tests in the Merge and Release Process Automated tests that run on every merge request create an immediate feedback loop for developers. When a code change introduces a regression, it’s caught within minutes rather than at the end of a sprint. 
That’s the core value of CI-integrated testing: defects are surfaced at the point of introduction, when they’re cheapest and fastest to fix. 6.2 Enforcing Quality Gates Without Slowing Delivery Quality gates work best when they’re designed as enabling constraints rather than blocking checkpoints. A gate that requires all tests to pass before a merge can proceed maintains quality without demanding human review of every change. The key is ensuring that the automated test suite is fast, focused, and reliable enough that gates don’t become a bottleneck. This requires ongoing investment in test suite performance: parallelization of test execution, targeted smoke tests for rapid validation, and selective regression runs based on code change impact. Modern CI/CD deployment systems are built around this principle, enabling frequent and reliable updates without sacrificing confidence in the release. 7. How to Continuously Improve Your QA Process The strongest QA teams treat quality as an evolving discipline rather than a fixed process. As products, teams, and release cycles change, testing practices must adapt as well. Continuous improvement comes from regularly reviewing what creates value, what introduces friction, and where time is being spent unnecessarily. Modern QA organizations rely on visibility and data to guide these improvements. Metrics such as test execution trends, regression cycle duration, defect escape rates, and coverage gaps help teams identify bottlenecks before they become larger delivery problems. The goal is not to measure individual performance, but to understand how effectively the overall QA process supports product quality. AI is increasingly helping teams accelerate this improvement cycle. Instead of spending hours maintaining test documentation or manually selecting regression suites, QA teams can use AI-assisted tools to streamline repetitive activities and focus on higher-value work. 
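The process metrics mentioned earlier in this section can be computed from a small amount of release data. The sketch below is only an illustration: `ReleaseStats` and its fields are hypothetical, and defect escape rate is taken here as production defects divided by all defects found for a release, which is one common definition rather than the only one.

```python
from dataclasses import dataclass


@dataclass
class ReleaseStats:
    release: str
    defects_pre_release: int     # found by testing before the release
    defects_in_production: int   # escaped and found after the release
    regression_hours: float      # duration of the regression cycle


def defect_escape_rate(stats: ReleaseStats) -> float:
    """Share of all known defects that were found only in production."""
    total = stats.defects_pre_release + stats.defects_in_production
    if total == 0:
        return 0.0
    return stats.defects_in_production / total


def escape_rate_change(history: list[ReleaseStats]) -> float:
    """Difference between the latest and the earliest escape rate in history."""
    if len(history) < 2:
        return 0.0
    return defect_escape_rate(history[-1]) - defect_escape_rate(history[0])


if __name__ == "__main__":
    history = [
        ReleaseStats("2026.1", 45, 5, 16.0),
        ReleaseStats("2026.2", 38, 2, 12.5),
    ]
    for stats in history:
        print(stats.release, round(defect_escape_rate(stats), 3), stats.regression_hours)
    print("change:", round(escape_rate_change(history), 3))
```

Figures like these describe the process as a whole; they are not meant for ranking individual testers or developers.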
Continuous improvement also requires regular maintenance of testing assets. Test cases, automation suites, and workflows should be reviewed, updated, and retired when they no longer provide meaningful coverage. Teams that treat their testing assets as living resources, rather than static documentation, are better positioned to scale quality as their products evolve. 8. Staying Current with Qatana: Navigating AI-Assisted QA in 2026 AI is rapidly becoming part of everyday QA workflows, but the most successful teams aren’t treating it as a replacement for human expertise. Instead, they’re using it to reduce repetitive work, accelerate test creation, and improve visibility across the testing lifecycle. The most practical AI applications in QA today include: Generating test cases from tickets, requirements, and release notes Supporting regression planning by identifying the most relevant test suites Reducing time spent on test documentation and maintenance Helping teams manage growing test repositories more efficiently Providing faster insight into testing progress and release readiness At TTMS, this is the approach we adopted when building Qatana. Rather than positioning AI as a black-box decision-maker, we use it to assist testers throughout the QA process while keeping human validation at the center of every workflow. Qatana helps teams generate draft test cases, select regression suites based on project changes, and maintain visibility across both manual and automated testing in a single environment. The goal isn’t to replace testers – it’s to free them from repetitive administrative work so they can focus on quality decisions, risk assessment, and release confidence. As AI adoption continues to grow, governance becomes just as important as capability. Organizations should establish clear review processes for AI-generated outputs and ensure that testing activities remain traceable and auditable. 
For regulated environments in particular, responsible AI practices are essential to balancing efficiency gains with compliance requirements. Contact us to try Qatana!

E-Learning Analytics in practice: How to interpret e-learning platform data?

It is easiest to measure what is visible right away: logins, clicks, time spent on the platform, completed modules, and quiz scores. That is why many organizations finish their training analysis right there. However, there is a difference between LMS activity and real learning. And completing a course does not always mean that the participant has acquired knowledge and will apply it in their job. So, how do we measure the real impact of training on an organization? We answer this question in this article using professional methods of learning analytics for e-learning. 1. Why does the interpretation of e-learning data make it difficult for organizations to assess training effectiveness? Assessing training effectiveness often looks simple only at the beginning. The platform shows reports, charts, quiz scores, and completion statuses. However, in our work with clients, we see that the problem begins when we need to answer a much harder question: did this training actually change anything in people’s work? It is easiest to measure activity. The LMS will show who completed the training, how much time they spent in the course, what their test score was, and whether they returned to the materials. This is necessary e-learning analytics data because it helps to see whether the participants went through the learning process at all and where they might have stopped. However, it does not yet tell us whether the employee used the new knowledge after closing the course. This is exactly where many organizations fall into a trap. Course completion starts to be treated as proof of effectiveness. Yet, a salesperson might pass a test on knowledge of a new product but still not introduce it into conversations with customers. A customer service employee may know a new procedure but, under time pressure, revert to old habits. It is even more difficult to show the impact of training on business. Here, LMS data analytics alone are no longer enough. 
You need to combine LMS data with what is happening in the organization: sales results, customer satisfaction, the number of operational errors, onboarding time for new employees, or the level of compliance with procedures. Only such a combination of data allows you to check whether the training translated into real change.

2. The Kirkpatrick Model. How to use it in practice?

The Kirkpatrick Model is often used to assess the effectiveness of training. Implementing it is the foundation on which professional e-learning data analytics processes are based. It helps to structure thinking: from participant reaction, through learning and behavior change, all the way to business results. It also clearly shows why “course completed” alone is not enough if an organization wants to know whether e-learning really works.

In practice, it is worth planning the measurement method before the training begins. That way, the organization knows from the very start what data it will collect and what effects it wants to achieve.

At the first level, participant reactions can be measured using short satisfaction surveys. At the second level, knowledge is checked, most often through tests, quizzes, or practical tasks completed after the course.

The third level requires observing what happens after the training. Depending on the topic, this could involve conversations with supervisors, work quality analysis, evaluation of new skills, or comparing results before and after the training. This is precisely where organizations most often discover that a high test score does not always translate into a change in behavior.

The fourth level focuses on business results. For onboarding, this could be the time to reach independence, the number of errors, or the completion of the onboarding path. In compliance training, the level of adherence to procedures, audit results, or the number of incidents are often analyzed.
In sales, it might be the results of salespeople and the use of product knowledge in customer conversations, and in customer service, the level of customer satisfaction and the time to resolve inquiries.

From our experience, organizations achieve the best results when they do not limit themselves to a single metric. Combining LMS data, behavioral observations, and business metrics provides a much more complete picture of training effectiveness than a test score or course completion rate alone, which is where corporate e-learning analytics shows its value.

| Kirkpatrick Level | What do we measure? | Example metrics | Key question |
|---|---|---|---|
| 1. Reaction | How participants received the training | satisfaction survey, usefulness rating, participant feedback | Was the training clear and useful to them? |
| 2. Learning | What the participants learned | test score, quiz, practical task, certificate | Did the participant actually acquire new knowledge or skills? |
| 3. Behavior | Whether participants apply knowledge at work | supervisor observation, work quality, number of errors, feedback conversations | Is the employee doing something differently after the training? |
| 4. Results | What is the impact on business | sales, onboarding time, customer satisfaction, compliance with procedures, number of incidents | Did the training bring a real benefit to the organization? |

One of the most common mistakes is treating the LMS report as a complete answer to the question of training effectiveness. If we look exclusively at course completion, test scores, or time spent on the platform, we only see participant activity and not a real change in their work. Without comparing the employee’s performance before and after the training against the assumed training objectives, it is difficult to assess whether the program actually improved skills.
In practice, this means that LMS data should be compared with supervisor observation, the quality of tasks performed, the number of errors, employee independence, or other metrics linked to the training goal.

The simplest way to enrich such an analysis is through delayed post-training surveys. A question asked a week, a month, or a quarter after the training often says more than a survey completed immediately after the course. Only then can you check whether the knowledge was used in practice, rather than just memorized for the sake of the test.

Norbert Kulski, Head of BI & Automation Solutions | TTMS

3. What statistics do most e-learning platforms provide and what do they measure?

Modern LMS platforms allow you to track dozens of different performance indicators. The problem is that not all of them say equally much about the actual results of learning. It is worth knowing which data are truly valuable and how to interpret them correctly with the help of professional e-learning analytics.

3.1 Completion Rate

The Completion Rate shows what percentage of participants completed the training. It is one of the most frequently monitored metrics because it is simple to measure and allows for a quick assessment of participant engagement. If a large portion of users do not finish the course, it may indicate problems with the training’s length, difficulty level, or attractiveness.

At the same time, a high Completion Rate does not automatically mean that the training was effective. A participant may complete a course solely because it is required by the organization. The mere fact of clicking the “Complete” button says nothing about whether they acquired new competencies and are using them at work.

3.2 Time Spent

Time Spent measures how long a user spent in the course. At first glance, it may seem that the more time a participant dedicated to the training, the more they learned.
In practice, this metric can be highly misleading. A long duration does not always mean active learning. A user might leave a browser tab open while performing other duties, take a coffee break, or get distracted from the training by other tasks. On the other hand, a very short time does not necessarily indicate a problem: an experienced employee may go through the material quickly because they already know some of the topics. Therefore, Time Spent is worth analyzing only in combination with other metrics.

3.3 Quiz Scores

Quiz Scores show the results of tests and knowledge checks. This is one of the best ways to assess whether a participant has retained key information from the training. Test results also help identify areas that require additional explanation or improvement of materials.

However, it is important to remember that a high test score does not always mean acquiring competence. A participant might memorize the correct answers to questions but have difficulty applying this knowledge in a real-world professional situation. Therefore, tests are best at verifying knowledge, not actual behavior change.

3.4 Login Frequency

Login Frequency shows how often users return to the training platform. Regular logging in can indicate participant engagement and that the training serves as a source of knowledge used in their daily work. This is a particularly valuable metric for development programs, competence academies, or knowledge bases.

However, the login frequency metric itself does not show what the user did after logging in. Frequent visits do not always mean active learning, just as less frequent logins do not have to mean a lack of interest.

3.5 Course Progress

Course Progress allows you to track which stage of the training participants are currently at. This is one of the most practical metrics for instructional designers. Thanks to it, you can quickly spot the places where users most often drop out of learning or lose interest.
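Spotting drop-out points can be sketched in a few lines. The example below is only an illustration under stated assumptions: the `progress` mapping (learner id to the last module opened) and the module names are hypothetical, not an export format of any particular LMS, and reaching the final module is treated as finishing the course.

```python
from collections import Counter


def drop_off_by_module(last_module_reached, module_order):
    """Count, per module, how many learners stopped there without finishing.

    Assumption: a learner whose last module is the final one has completed
    the course and is therefore not counted as a drop-out.
    """
    final_module = module_order[-1]
    stops = Counter(
        module
        for module in last_module_reached.values()
        if module != final_module
    )
    # Keep the course order so the result reads like the learning path.
    return [(module, stops.get(module, 0)) for module in module_order[:-1]]


# Hypothetical data: learner id -> last module the learner opened.
progress = {
    "u1": "M4", "u2": "M2", "u3": "M2",
    "u4": "M4", "u5": "M3", "u6": "M2",
}
modules = ["M1", "M2", "M3", "M4"]

drop_offs = drop_off_by_module(progress, modules)
worst_module, worst_count = max(drop_offs, key=lambda pair: pair[1])
print(drop_offs)
print(f"Most drop-outs: {worst_module} ({worst_count} learners)")
```

In this made-up data set, half of the learners stop at the same module, which is the kind of signal that would justify a closer look at that module.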
If the majority of participants drop out in the same module, it is worth analyzing its length, difficulty level, way of presenting content, or quality of interactions. Such data often allow for the detection of problems that satisfaction surveys or test results do not show.

From our experience, analyzing participant drop-out points is one of the fastest ways to improve the quality of existing courses and increase the effectiveness of the entire development program using LMS data analytics.

Norbert Kulski, Head of BI & Automation Solutions | TTMS

| Metric | What does it show? | What is worth remembering? |
|---|---|---|
| Completion Rate | Whether the participant completed the course | Training completion does not yet mean that the participant has acquired competencies or changed their way of working. |
| Time Spent | How much time the user spent in the course | A long time in the course can mean learning, but also an open tab, a break, or a lack of concentration. |
| Quiz Scores | What results the participant achieved in tests | A high test score shows information retention, but not always the ability to apply it in practice. |
| Login Frequency | How often the user returns to the platform | Frequent logins can indicate engagement, but they do not show the quality of learning. |
| Course Progress | What stage of the course the participants are at | The most valuable are the places where users stop learning; that is often where the real problems of the course are visible. |

4. E-Learning Analytics: what is the difference between learning analysis and activity reporting?

Many organizations today use reports available in LMS platforms. Thanks to them, you can quickly check who completed the training, how much time they spent on the course, or what score they achieved in the test. The problem is that these are primarily descriptive data that show what happened. Proper analysis of LMS platform data must go beyond simple reports. The answer to this is learning analytics for e-learning, which goes a step further.
Instead of focusing solely on numbers, it helps understand the reasons behind specific behaviors and discover patterns that can affect learning effectiveness. You could say that the LMS answers the question “what happened?”, while e-learning analytics helps answer the question “why did it happen?”.

For example, a report alone might show that 30% of participants did not complete the training. E-learning analytics, however, allows you to check where users drop out most frequently, what content causes them difficulty, and whether the problem stems from the course design, the difficulty level of the material, or the way the knowledge is presented.

In practice, training data analysis should lead to asking questions that help improve the learning process:

- Which modules cause the participants the most difficulty?
- Which test questions most frequently result in errors?
- At what stage do users most often interrupt the training?
- Which materials do they return to repeatedly?
- Which content is skipped or scrolled through the fastest?
- Which elements of the course best support knowledge retention?

Effective reporting in e-learning requires going beyond standard LMS data. From our experience, the greatest value comes not from simply collecting data, but from using it for continuous training improvement. Even minor changes in places where users encounter difficulties can significantly improve the effectiveness of the entire development program.

Norbert Kulski, Head of BI & Automation Solutions | TTMS

| Reporting in LMS | E-Learning Analytics |
|---|---|
| Shows what happened | Helps understand why it happened |
| Who completed the training? | Why did some participants not complete the training? |
| What was the test score? | Why do participants make errors in specific areas? |
| How much time was spent in the course? | Which elements of the course engage, and which cause dropouts? |
| How often do users log in? | Which materials are actually being used at work? |
| Historical data | Conclusions leading to training optimization |
| Measuring activity | Improving the learning process |

I see the greatest value in training data when it stops serving a purely reporting function and begins to support course development. By analyzing the points where participants most frequently stop learning, make mistakes, or return to specific materials, we can very quickly identify elements that need improvement.

From an e-learning design perspective, this rarely means having to rebuild the entire course. More often, precise adjustments are enough: simplifying a selected module, adding a practical example, shortening a lesson that is too long, or changing the form of interaction. Such decisions are worth making based on actual data, rather than on intuition alone.

The most effective organizations treat training as solutions that constantly evolve. Each subsequent edition of a course provides new information, allowing them to systematically increase its effectiveness and better respond to the needs of the participants.

Mikołaj Korzeniowski, E-learning Tech Lead at TTMS | Product Owner of AI4E-learning

5. Measuring completion rate is not enough.

For years, researchers studying learning processes have pointed out that the ease of learning can be misleading. Robert Bjork, a professor of psychology and author of the concept of desirable difficulties, showed that conditions that make learning seem easy and fluid often lead to poorer long-term knowledge retention.

A good example is mathematics exercises. If we solve only one type of task for an hour, by the end of the class we might feel that the material has been mastered. Both the teacher and the students see rapid progress. However, when the test takes place a few weeks later, the results are often disappointing.

Research shows that better results are achieved by interleaving different types of tasks, even though participants make more mistakes during learning and feel it is more difficult.
Paradoxically, this extra effort leads to more permanent retention and more effective use of knowledge in the future.

This is an important lesson for e-learning creators too. A training course that is fast, easy, and hassle-free to complete will not always be the most effective. Sometimes, greater value is delivered by a course that requires active thinking, decision-making, problem-solving, or recalling previously acquired knowledge.

Therefore, when looking at the course completion rate, it is worth keeping in mind three gaps between the levels of training effectiveness in the Kirkpatrick Model:

- Completion does not mean understanding. A participant can go through all modules and obtain a certificate without absorbing key information.
- Understanding does not mean application. An employee can answer test questions correctly but fail to use the new knowledge during their daily work.
- Application does not yet mean a business result. Even if employee behavior changes, the organization still needs to check whether this translated into better sales, higher service quality, fewer errors, or other expected results.

This is exactly why the best organizations do not stop at completion rate analysis, but instead investigate e-learning analytics metrics much more thoroughly. They treat the completion rate as a starting point, not proof of training effectiveness. Real value only appears when participant activity data is combined with information on behavior change and business outcomes.

| Learning analytics myth | Reality |
|---|---|
| A high completion rate means effective training. | The completion rate only shows participant activity. |
| High test scores guarantee behavior change. | Knowledge does not always translate into action. |
| Behavior change automatically improves company performance. | Business impact requires additional measurement and analysis. |
| A single metric can evaluate training effectiveness. | Effectiveness must be analyzed on multiple levels simultaneously. |

6. SCORM limitations and xAPI (Experience API) capabilities in training process analysis

For many years, SCORM was the standard in the e-learning world. It allowed organizations to check basic information about participant progress: who completed the training, what test score they achieved, or how much time they spent in the course.

The problem is that modern learning takes place less and less often solely within an LMS. Employees watch instructional videos, use knowledge bases, participate in webinars, perform practical tasks, learn in mobile apps, and collaborate with other employees. Traditional SCORM was not designed to track such activities.

In practice, SCORM primarily answers the question: “Did the participant complete the training?” More and more organizations, however, want to know:

- How did the participant learn outside the LMS?
- Which materials did they return to?
- Which resources do they use in their daily work?
- What actions do they perform after completing the training?
- Is the knowledge still being used after several weeks or months?

It is precisely for these needs that the xAPI (Experience API) standard, also known as Tin Can API, was developed.

| SCORM | xAPI |
|---|---|
| Tracks primarily activity within the LMS course | Tracks activity across the entire learning ecosystem |
| Course completion | Every learning experience |
| Test results | User behaviors |
| Time spent in the course | Use of materials after training |
| Limited to LMS | Data from LMS, apps, webinars, simulations, and other sources |
| Answers the question “what happened?” | Helps analyze “how does the learning process occur?” |

6.1 How does xAPI work?

xAPI is based on a simple model of logging user experiences. Every action is recorded in the format: “Someone did something.” For example:

- Anna completed the onboarding course.
- Tomasz watched the instructional video.
- Karolina solved the sales scenario.
- Michał downloaded the safety procedure.
- Ewa participated in the webinar.
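The “someone did something” pattern maps onto the actor, verb, and object fields of an xAPI statement. The sketch below builds one such statement as a plain Python dictionary. The helper name `build_statement`, the e-mail address, and the activity IRI are illustrative assumptions; only the field layout and the ADL “completed” verb IRI come from the xAPI vocabulary.

```python
import json


def build_statement(name, email, verb_id, verb_label, activity_id, activity_name):
    """Assemble a minimal xAPI statement: actor (who), verb (did), object (what)."""
    return {
        "actor": {
            "objectType": "Agent",
            "name": name,
            "mbox": f"mailto:{email}",  # agents are identified by a mailto IRI here
        },
        "verb": {
            "id": verb_id,
            "display": {"en-US": verb_label},
        },
        "object": {
            "objectType": "Activity",
            "id": activity_id,
            "definition": {"name": {"en-US": activity_name}},
        },
    }


# "Anna completed the onboarding course."
statement = build_statement(
    name="Anna",
    email="anna@example.com",  # hypothetical address
    verb_id="http://adlnet.gov/expapi/verbs/completed",
    verb_label="completed",
    activity_id="https://example.com/courses/onboarding",  # hypothetical activity IRI
    activity_name="Onboarding course",
)
print(json.dumps(statement, indent=2))
```

Each of the example sentences above would produce a statement of the same shape, with a different verb and activity.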
This information is sent to a special data repository called a Learning Record Store (LRS), which can collect data from many different sources, not just a single LMS.

6.2 What data can be collected thanks to xAPI?

The greatest advantage of xAPI is the ability to track the entire learning path rather than just activities inside a course. For instance, an organization can analyze:

- watching training videos,
- using the knowledge base,
- downloading documents and procedures,
- participating in webinars,
- activity in mobile applications,
- performing simulations and scenarios,
- training game results,
- participation in classroom workshops,
- implementing onboarding tasks,
- using supporting materials after training is completed.

This makes it possible not only to measure course completion but also to analyze actual learning-related behaviors.

6.3 Why does this matter for Learning Analytics?

If the LMS primarily shows what happened in the course, xAPI allows you to observe the entire learning process. The organization can check which materials are most frequently used, which resources employees return to over time, and which activities actually support competency development.

This is exactly why xAPI is often seen as one of the foundations of modern e-learning analytics. It allows you to move from simple course completion reporting to the analysis of participants’ actual educational experiences.

xAPI Data Examples

| Activity | Example |
|---|---|
| Watching a video | User watched 80% of the video |
| Simulation | User selected an incorrect response |
| Knowledge base | User searched for a procedure |
| Mobile app | User completed a microlearning unit |

7. What metrics for e-learning analysis are truly worth monitoring?

Metrics available on the LMS platform alone rarely allow you to assess the actual effectiveness of training. Information about course completion, the number of logins, user activity, or quiz scores is valuable, but only combining it with other business data allows you to understand whether the training delivered the expected results.
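Combining LMS records with business data, as described above, can start with a very simple join on an employee identifier. The sketch below is a hypothetical illustration: the `lms` and `kpi` dictionaries, their field names, and the numbers are invented for the example, and a difference between the two groups is a prompt for further analysis rather than proof that the training caused it.

```python
from statistics import mean

# Hypothetical LMS export: employee id -> course record.
lms = {
    "e1": {"completed": True, "quiz_score": 92},
    "e2": {"completed": True, "quiz_score": 88},
    "e3": {"completed": False, "quiz_score": None},
    "e4": {"completed": False, "quiz_score": None},
}

# Hypothetical business data: employee id -> (KPI before, KPI after),
# for example the number of closed deals in the month before and after.
kpi = {"e1": (10, 14), "e2": (8, 9), "e3": (9, 10), "e4": (11, 11)}


def average_kpi_change(lms_records, kpi_records, completed):
    """Average KPI change for employees with the given completion status."""
    changes = [
        kpi_records[emp][1] - kpi_records[emp][0]
        for emp, record in lms_records.items()
        if record["completed"] is completed and emp in kpi_records
    ]
    return mean(changes) if changes else None


trained = average_kpi_change(lms, kpi, completed=True)
untrained = average_kpi_change(lms, kpi, completed=False)
print(f"Average KPI change, completed: {trained}, not completed: {untrained}")
```

The same join can be repeated with whichever business metric matches the goal of the training.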
7.1 In the case of onboarding new employees, the most important metric is the time to reach independence

A lot depends on the goal of the training and the organizational area it concerns. For example, if a course was prepared as part of onboarding new employees, the HR department will be interested in more than just whether the participant completed all modules. Much more important information will be the moment when the newly hired person reaches independence and no longer requires constant support from a supervisor or more experienced colleagues. It is this moment that shows when the employee begins to bring full value to the organization.

7.2 In sales, training effectiveness should be evaluated through the lens of business results

The analysis of training data in sales looks completely different. Let’s assume that the sales department has completed training on a new product. All indicators available in the LMS look perfect: employees viewed all materials, actively used the knowledge base, completed the training, took part in simulations, and achieved high test scores.

At this stage, we can only state that the participants went through the training process. This does not automatically mean, however, that the training was effective.

Only combining LMS data analytics with sales results allows you to assess its real impact. Among other things, it is worth checking whether sales representatives offer the new product to customers more often, whether the number of closed deals has increased, whether sales value has improved, and whether employees can use product knowledge during sales calls.

Such a comparison of data can lead to very different conclusions. If training activity was high but product sales did not increase, the problem may lie in the training itself, the way knowledge was transferred, or in the sales process.
If, on the other hand, the best salespeople achieve high results both in training and in sales, the organization can identify practices that are worth spreading across the entire team. It is also possible to detect individuals who perform well in tests but have difficulty using knowledge in practice, which may point to the need for additional exercises or manager support.

7.3 In customer service, training data should be combined with service quality metrics

Similar dependencies can be observed in customer service departments. In this case, training data is worth comparing with metrics such as average handle time, the number of first-contact resolutions, or customer satisfaction levels. Only combining this information allows you to assess whether the training translated into improved service quality and team efficiency.

Learning analytics, therefore, is not about analyzing single metrics in isolation from the context. Its goal is to combine training data with actual business results and find the answer to the most important question: did the training affect the way participants work and the results achieved by the organization?

| Training area | Metrics worth combining with LMS data |
|---|---|
| Onboarding | time to reach independence, number of errors made by the new employee, onboarding path completion |
| Compliance | level of compliance with procedures, knowledge test results, number of incidents or violations after training |
| Sales | sales representatives’ results, number of offers or transactions for a given product, use of product knowledge in customer conversations |
| Customer service | customer satisfaction level, ticket resolution time, number of issues resolved at first contact |

From an analytical perspective, the most valuable metrics are those that can be directly linked to the training goal and the business outcome. The shorter the path between training and a measurable effect, the easier it is to assess the actual value of the development program.
Norbert Kulski, Head of BI & Automation Solutions | TTMS

8. Artificial intelligence and the future of e-learning analytics

Traditional training reports primarily show what has already happened: who completed the course, what score they achieved, how much time they spent in a module, and where they stopped learning. This is important data, but organizations increasingly need something more. They want to know not only what happened, but also what might happen next.

This is exactly where AI is beginning to change the way we think about e-learning analytics. Instead of analyzing solely the past, it can help predict risks and point out areas that require support. We describe this in more detail in our article “How to Measure E-learning Training Effectiveness with AI? Every CLO Should Know This”, in which we show why training data should be connected with business goals and competency development.

In practice, AI can help answer questions that were previously difficult to capture in standard reports:

- which participants are likely to drop out of the training,
- which modules cause the most issues,
- at which points users make errors most frequently,
- which competencies require additional support,
- which groups of employees need a different learning path,
- whether the training can translate into specific business metrics.

8.1 What does AI bring to training analytics?

The greatest value of AI is not that it generates another report. Its strength lies in detecting patterns that a human might not notice right away. If the system sees that participants from a specific department often stop in the same module, achieve lower scores on similar questions, and return to the materials less frequently, this could be a sign that the problem does not lie in engagement, but in the training design or a mismatch in the difficulty level.

AI can also support the personalization of learning paths.
A participant who struggles with a given topic can receive additional materials, shorter reviews, practical exercises, or an alternative module. On the other hand, someone who quickly mastered the basics does not have to go through all the content at the same pace as the rest of the group.

This is particularly important in larger organizations, where a single training path rarely fits all employees. A new hire in onboarding needs different content and support than a sales representative learning about a new product, or an employee undergoing mandatory compliance training.

From our experience, AI in e-learning analytics works best when it does not replace human decisions but helps make them faster and based on better data. The system can point out a risk, a pattern, or a competency gap. The ultimate interpretation should still belong to the L&D team, managers, and those responsible for employee development.

Traditional Reporting vs. AI-supported E-Learning Analytics

| Area | Traditional Reporting | Learning Analytics with AI |
|---|---|---|
| Data approach | Shows what happened in the course | Helps predict what might happen next |
| Training completion | Informs who completed the course | Can indicate who is at risk of dropping out |
| Course issues | Shows scores and progress | Helps detect modules that cause difficulties |
| Competency gaps | Often visible only after test results | Can be identified earlier based on behavioral patterns |
| Learning path | The same for all participants | Can be personalized to the level and needs of the user |
| Human role | Analyzes the report after the training is completed | Interprets AI recommendations and makes development decisions |

TTMS Expert Commentary:

Today, the concept of AI covers much more than generative artificial intelligence. In the context of learning analytics for e-learning, solutions in the area of data science and machine learning also play a huge role, as they can analyze large datasets and detect relationships that are difficult to notice during traditional report analysis.
In practice, this means the ability to identify anomalies and predict problems before they affect the effectiveness of the training program. The system can indicate groups of participants at risk of not completing the course, detect modules that consistently cause difficulty, or identify behavioral patterns pointing to competency gaps.

Thanks to this, the organization is not limited to analyzing the past, but can react faster and continuously improve both the training content and the entire process of employee development.

Norbert Kulski, Head of BI & Automation Solutions | TTMS

9. Summary: Analyzing training data with AI

E-learning analytics is much more than analyzing reports from an LMS platform. Completing a training course, a high test score, or frequent logins to the system are not yet proof of an effective learning process. The biggest challenge for organizations is moving from measuring activity to measuring the real impact of training on employee behavior and business results.

This is precisely why models such as Kirkpatrick, more advanced data collection standards like xAPI, and solutions using artificial intelligence are gaining importance. From our experience, the most effective organizations do not only ask “was the training completed?”. They ask much more difficult questions: what did the participants learn, how do they apply this knowledge at work, and does the training contribute to achieving business goals?

Data alone does not improve training quality. Value only appears when an organization can translate insights from e-learning data analytics into concrete actions: improving content, changing learning paths, providing better support to participants, and creating more effective development programs.

In the coming years, the role of corporate e-learning analytics will likely continue to grow.
Thanks to AI, organizations can understand better and better not only what happened during training, but also what actions are worth taking to increase learning efficiency and develop employee competencies faster.

| Key Conclusion | What does it mean in practice? |
|---|---|
| Completion rate is not enough | Course completion does not mean acquiring competence. |
| Learning Analytics is not reporting | The most important thing is to understand the reasons behind participant behaviors. |
| Knowledge does not always translate into action | A high test score does not guarantee behavioral change at work. |
| Training data should be combined with business KPIs | Only then can the real impact of the training be evaluated. |
| AI helps predict, not just report | It becomes possible to detect risks, competency gaps, and training needs earlier. |

10. How does TTMS help organizations measure training effectiveness?

At TTMS, we help organizations not only create e-learning courses but also better understand whether they actually work. We combine experience in instructional design, e-learning data analytics, and the deployment of AI-based solutions to support companies at every stage of the process: from material preparation, through course publishing, to outcomes analysis.

Our solutions allow for the transformation of corporate knowledge, documentation, procedures, and expert materials into online courses, and then the analysis of participant progress, test results, and engagement data. Thanks to this, organizations can quickly notice which content works well, where participants face difficulties, and which areas require additional support.

In practice, this means moving from the simple question “did the employee complete the training?” to much more important questions: did they understand the material, can they use the knowledge at work, and does the training support the business goals of the organization?
By combining e-learning, LMS data analytics, and AI, we help companies design training programs that do not end with a certificate but realistically support competency development, onboarding, compliance, sales, and customer service.

FAQ

What is learning analytics for e-learning?

Learning analytics for e-learning is the measurement, collection, analysis, and reporting of data about learners and their contexts. Unlike traditional LMS activity reporting, which only tells you what happened (e.g., who completed a course), learning analytics focuses on understanding why it happened. It connects learning experiences with behavioral changes and business KPIs to continuously optimize the training process and improve organizational performance.

What is the difference between LMS reporting and e-learning analytics (Learning Analytics)?

LMS reporting lets you see “what happened” (e.g., who completed the course, what the test score was), while advanced e-learning data analytics helps you understand “why” it happened. Thanks to these analytics, you’ll not only find out that participants didn’t complete the training, but also where the problem occurred and how to optimize the content to increase its effectiveness.

Is the SCORM standard sufficient for modern e-learning analytics?

SCORM is sufficient for tracking basic activity within the LMS platform (completion, test scores). However, to measure the actual impact of training on the organization and track the learning process outside the system (e.g., webinars, working with a knowledge base, simulations), the xAPI standard is essential. It allows for the recording of every employee’s learning experience in one place.

How can you link e-learning data to a company’s business results?

Effective training process analytics requires aligning data from the LMS with your organization’s specific KPIs, such as sales results, customer service levels, or onboarding time.
This transforms training from merely a “requirement” into a measurable tool that supports the company’s actual business goals.

How does artificial intelligence support the analysis of training processes?

AI is transforming analytics from reactive to predictive. Instead of analyzing only historical data, artificial intelligence can detect behavioral patterns that humans overlook. Among other things, it can identify groups of participants at risk of dropping out, pinpoint skill gaps early, or personalize training paths by tailoring them to the level of difficulty an employee is facing.

Wiktor Janicki
