AI shifts jobs to judgment as agentic systems mature
A data science leader explains how layered architectures and human oversight curb hallucination risks while reshaping roles
Artificial intelligence (AI) is no longer just changing how people work. It is changing what work is worth. As machines take over more execution tasks, the value of employees is shifting toward judgment, accountability and the ability to operate across disciplines.
That shift is already visible across organizations. Routine documentation, first-pass analysis and coordination are becoming easier to automate. The harder question is what professionals do with the responsibility left behind, particularly when decision-making authority remains with humans.
“The question most people are asking is whether AI will take their job. That is the wrong question. The more useful one is: what does my role look like when AI handles the execution?” said Aashutosh Nema, Principal Data Scientist, Dell Technologies.
“Every role is being redefined around judgment, accountability and the ability to work with other humans in complicated situations. That is true whether you are fifteen years into a career, graduating this summer or still in school.”
Nema said early-career roles are already blending. Product managers are expected to prototype, analysts to handle data science tasks and developers to move across tools. The advantage is no longer narrow specialization but the ability to redesign workflows with embedded AI.
He added that for graduates, depth still matters. A specialist with strong domain knowledge can guide AI systems in ways a generalist cannot. But depth has to be paired with range, so expertise can be applied across adjacent problems.
“It is not expertise that is under pressure at this stage. It is the execution layer wrapped around it. The reporting, the first-pass analysis and the routine documentation are being absorbed fast,” Nema said, adding that professionals need to focus on where their judgment materially changes outcomes.
He said that distinction is critical in high-stakes fields where accountability cannot be delegated. The task is to separate necessary judgment from legacy process, rather than preserving workflows simply because they have always existed.
In practice, this requires actively deciding which parts of a workflow to automate and which to retain. Blind adoption of AI risks creating faster but less accountable systems.
“Leadership in an AI-enabled organization is a different job and most leaders are underestimating how different. The coordination work that used to fill a manager’s week is largely automatable now,” Nema said.
As AI contributes more to output, evaluation shifts toward judgment quality, oversight and the ability to detect errors. Companies are already rethinking performance metrics, team structures and how individual contribution is measured.
Taming hallucination risk
Nema was speaking in an interview with TechJournal.uk about enterprise AI adoption and system design. A central risk in large language models (LLMs) remains hallucination—fluent but incorrect output that can affect downstream decisions.
“A language model predicts the next token given everything before it. It has no concept of truth. When uncertain, it doesn’t go silent. It fills the gap with something that sounds right,” he said.
“But so do we. Ask anyone a question they half-know the answer to. They won’t say ‘I don’t know’ and stop. They’ll reconstruct. They’ll bridge gaps with plausible detail and deliver it with the same confidence as something they’re certain about.”
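Nema's point about next-token prediction can be illustrated with a toy example. The candidate tokens and scores below are invented for illustration, and real models sample over vocabularies of tens of thousands of tokens, but the mechanism is the same: raw scores become a probability distribution and one token is sampled, with no truth check anywhere in the loop.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into a probability distribution over tokens."""
    m = max(logits.values())
    exps = {tok: math.exp(s - m) for tok, s in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical scores a model might assign to candidate next tokens after
# the prompt "The capital of Australia is". Nothing here checks whether a
# token is *true* -- only how plausible it looked during training.
logits = {"Canberra": 3.1, "Sydney": 2.8, "Melbourne": 1.2}
probs = softmax(logits)

# Sampling can confidently emit the wrong answer with meaningful probability.
tokens, weights = zip(*probs.items())
choice = random.choices(tokens, weights=weights, k=1)[0]
```

Because "Sydney" scores nearly as high as "Canberra" in this toy distribution, the sampler will sometimes produce it with exactly the same fluency, which is the hallucination mechanism Nema describes.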
Rather than relying solely on better models, organizations are redesigning systems to manage risk at the architectural level.
“The good news is that this is becoming a solved problem at the system level even if it remains a work in progress at the model level,” he said. “Hallucination rates have dropped with each architectural iteration, not because the base model became perfectly truthful but because the architecture got smarter about not trusting it blindly.”
Nema said modern deployments use layered workflows that combine retrieval, validation and structured inputs. Context engineering ensures models receive relevant information, while memory layers store verified outputs to prevent repeated errors.
This approach is particularly important in enterprise settings, where a single incorrect output can cascade through multiple steps. Checkpoints allow errors to be caught early rather than corrected after they propagate.
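The article does not describe a specific implementation, but a layered workflow of this kind — retrieval to supply trusted context, a checkpoint that validates grounding, and a memory layer that caches verified outputs — might be sketched as follows. All names are hypothetical, the knowledge base is a plain dictionary, and `generate` is a stand-in for a real LLM call:

```python
from dataclasses import dataclass, field

@dataclass
class LayeredPipeline:
    knowledge_base: dict                                # trusted source documents
    verified_cache: dict = field(default_factory=dict)  # memory of verified outputs

    def retrieve(self, question):
        """Context engineering: supply only relevant, trusted text."""
        return self.knowledge_base.get(question, "")

    def generate(self, question, context):
        """Stand-in for an LLM call; a real system would invoke a model here."""
        return f"Answer based on: {context}" if context else "I don't know"

    def validate(self, answer, context):
        """Checkpoint: reject any output not grounded in the retrieved context."""
        return bool(context) and context in answer

    def answer(self, question):
        if question in self.verified_cache:   # memory layer prevents repeat errors
            return self.verified_cache[question]
        context = self.retrieve(question)
        draft = self.generate(question, context)
        if self.validate(draft, context):
            self.verified_cache[question] = draft
            return draft
        return "Escalate to a human reviewer"  # fail closed rather than fluently
```

The design choice the sketch illustrates is that the checkpoint sits between generation and release, so an ungrounded answer is stopped before it can cascade into later steps.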
He said agentic systems extend this model by distributing tasks across specialized components. Instead of relying on a single response, they create workflows in which different agents perform defined roles and validate each other’s outputs.
“Agentic systems go further. They plan, use tools, verify outputs and loop back when something doesn’t check out. That is a fundamentally different operating model,” he said.
“Multiple specialized agents handle distinct roles such as retrieval, summarization, analysis and quality control. Each handoff is a checkpoint. Errors get caught before they travel downstream.”
Nema added that more advanced systems introduce reflective evaluation, allowing outputs to be reassessed before finalization, while monitoring agents intervene when inconsistencies appear.
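A minimal sketch of that pattern, with invented agents and toy data, shows how each handoff can act as a checkpoint and how a failed reflective check loops back or escalates instead of passing a bad output downstream:

```python
# Toy source material the retrieval agent can draw on (hypothetical data).
DOCS = {
    "latency spike": "Cache misses rose after the deploy; rollback fixed it.",
}

def retrieval_agent(task):
    """Fetch source material for the task; empty string if nothing is found."""
    return DOCS.get(task, "")

def summarizer_agent(source):
    """Stand-in for an LLM summarization call."""
    return source.split(";")[0] + "." if source else ""

def qc_agent(summary, source):
    """Reflective checkpoint: accept only summaries grounded in the source."""
    return bool(summary) and summary.rstrip(".") in source

def run_workflow(task, max_attempts=2):
    """Each handoff is a checkpoint; failures loop back instead of travelling on."""
    for _ in range(max_attempts):
        source = retrieval_agent(task)
        if not source:                      # checkpoint: nothing to ground on
            break
        summary = summarizer_agent(source)
        if qc_agent(summary, source):       # checkpoint: verify before release
            return summary
    return "Escalate to a human operator"   # monitoring fallback
```

Real agentic frameworks add tool use, planning and richer verification, but the control flow is the same: no agent's output reaches the next stage unchecked.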
This shifts reliability from the model itself to the surrounding system. The model predicts, but the organization determines when that prediction is trustworthy.
LLMs, AGI and NBA
Some AI experts have suggested that LLMs may represent a dead end for artificial general intelligence (AGI), arguing that they lack true inference capability. Despite this debate, LLMs remain central to modern AI systems.
“LLMs are the best general-purpose reasoning primitive we have today. They are not the final product needed for AGI or full autonomy but they definitely are the engine inside the final product,” Nema said.
“Reasoning models now think before they answer. They verify their own outputs. They are trained not just to sound correct but to be provably correct on tasks that can be checked.”
At the same time, expectations around AGI remain uncertain.
“Current models identify that A follows B. They cannot reliably explain why. That distinction is the entire gap between a sophisticated pattern matcher and something that genuinely understands the world,” Nema said.
“The honest answer is that nobody knows because we don’t fully agree on what AGI means. When definitions shift as progress slows, that tells you something.”
Nema said scaling compute alone is unlikely to close that gap. Progress depends on advances in causal reasoning, memory and embodied learning—areas that remain scientific challenges rather than solved engineering problems.
“We are building increasingly powerful tools. We have not yet built a mind. The distance between those two things is the most important unsolved problem in science,” Nema said.
Nema pointed to Dell Technologies’ Next Best Action (NBA) system, which uses machine learning (ML) trained on historical support cases to recommend real-time troubleshooting steps.
Dell customers can contact technical support agents for hardware and software issues, but troubleshooting can require collecting system logs, reviewing guidance and analyzing past cases to identify root causes. The NBA system is designed to reduce that manual burden.
It suggests actions such as part replacement, software updates or predefined troubleshooting processes, and helps agents navigate large volumes of technical information more efficiently.
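The internals of Dell's NBA system are not public, but the core idea — recommend the action that resolved the most similar historical cases — can be sketched with simple nearest-neighbour matching over symptom keywords. The cases, symptoms and action names below are entirely invented for illustration:

```python
from collections import Counter

# Hypothetical historical support cases: symptom keywords -> resolving action.
HISTORY = [
    ({"boot", "fan", "noise"}, "replace_fan"),
    ({"boot", "blue_screen"}, "update_driver"),
    ({"disk", "slow"}, "run_disk_diagnostic"),
    ({"fan", "noise", "heat"}, "replace_fan"),
]

def next_best_action(symptoms, k=2):
    """Recommend the most common action among the k most similar past cases."""
    # Similarity here is just symptom overlap; a production system would use
    # learned embeddings over logs, guidance documents and case notes.
    ranked = sorted(HISTORY, key=lambda case: len(case[0] & symptoms), reverse=True)
    top_actions = [action for _, action in ranked[:k]]
    return Counter(top_actions).most_common(1)[0][0]
```

As the article notes, a recommendation like this is surfaced to a human support agent and validated before execution, not applied automatically.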
Nema has led work on its AI framework, including human-in-the-loop evaluation and a phased agentic AI strategy aligned with business needs, ensuring outputs are validated before execution.
For enterprises, the lesson is clear: AI can automate execution and accelerate workflows, but value still depends on system design, validation and human judgment. Organizations that combine these elements effectively are more likely to scale AI safely and sustainably.