AI Without the Cost: Rethinking Intelligence for a Constrained World
20 Feb 2026 10:00h - 11:00h
Session at a glance
Summary
This discussion focused on optimizing AI infrastructure costs and reducing reliance on expensive GPU-based systems through alternative mathematical approaches and software optimization. Bernie Alen from STEM Practice Company, an Oracle partner, led a panel exploring how existing mathematical methods can dramatically reduce computational requirements for AI applications without compromising accuracy or introducing latency.
The panelists identified several key challenges in current AI development. Professor Anshumali Shrivastava from Rice University highlighted the “AI memory wall” problem, where the growth rate of AI model parameters far exceeds hardware capacity improvements, leading to increasingly slower and less accessible models. He emphasized that the next major breakthrough will come from extending context windows in large language models, which currently plateau due to quadratic complexity limitations in attention mechanisms. Shrivastava presented research showing that new mathematical approaches to attention can enable CPU-based systems to outperform GPU clusters for large context windows.
Dr. Kenny Gross from Oracle demonstrated the MSET (Multivariate State Estimation Technique) method, which achieves real-time anomaly detection and predictive maintenance without neural networks or GPUs, delivering accuracy rates comparable to traditional methods at a fraction of the computational cost. The panel discussed a healthcare use case with Tata Group that achieved 100% accuracy using CPU-only infrastructure, representing cost reductions of up to 2,500 times compared to GPU-based solutions.
The discussion also addressed governance challenges, data sovereignty concerns, and the sustainability implications of current AI infrastructure demands. Panelists emphasized that while AI adoption has accelerated rapidly, the industry has skipped traditional software optimization steps that could significantly reduce environmental impact and infrastructure costs. The overall message was that mathematical solutions for efficient AI already exist and should be implemented before further straining global power and cooling resources.
Keypoints
Major Discussion Points:
– AI Infrastructure Optimization and Cost Reduction: The panel emphasized that the current approach of using expensive GPU-based infrastructure for AI applications is unsustainable. They advocated for software optimization methods using existing mathematical techniques to reduce computational complexity, allowing AI to run on CPUs, edge computers, and mobile devices instead of costly GPU clusters.
– Sustainability and Environmental Impact of AI: A significant concern raised was the environmental cost of current AI infrastructure, including excessive power consumption, heat generation, and water usage for cooling. The discussion highlighted the need for more sustainable AI approaches that don’t harm the planet while scaling rapidly.
– Context Window Limitations and the Future of AI: Professor Anshumali presented research showing that the next major challenge in AI development is extending context windows beyond current limitations. He demonstrated that traditional GPU-based approaches hit a plateau due to quadratic complexity, while new mathematical approaches could enable much longer context windows on CPUs.
– Governance, Reliability, and Hallucination Issues: The panel addressed critical concerns about AI reliability, particularly hallucinations in large language models, and the need for deterministic AI in enterprise applications. They discussed governance challenges, data sovereignty, and the importance of auditability in AI systems, especially for critical applications in healthcare, legal, and industrial settings.
– Real-world Implementation and Enterprise Adoption: The discussion covered practical aspects of AI transformation in organizations, including the step-by-step process from identifying use cases to production deployment, the importance of data quality and governance frameworks, and the challenges of moving from general AI tools like ChatGPT to enterprise-specific solutions.
Overall Purpose:
The discussion aimed to educate attendees about alternative approaches to AI development that prioritize efficiency, sustainability, and cost-effectiveness over the current trend of using expensive GPU infrastructure. The panel sought to demonstrate that mathematical optimization techniques can achieve better results at a fraction of the cost while addressing critical issues like reliability, governance, and environmental impact.
Overall Tone:
The tone began as educational and somewhat urgent, with speakers expressing concern about the unsustainable direction of current AI development. Bernie Alen set an activist tone, arguing that the industry is “going mad” and needs to consider alternative approaches before “burning the planet.” As the discussion progressed, the tone became more collaborative and solution-oriented, with panelists sharing practical experiences and technical insights. The atmosphere remained engaging and interactive, with speakers encouraging questions and emphasizing the importance of spreading awareness about these alternative approaches. The tone concluded on a hopeful note, suggesting that sustainable, efficient AI solutions are not only possible but already being implemented successfully.
Speakers
Speakers from the provided list:
– Bernie Alen: Led advanced technologies market development for Oracle; Founder/Leader of STEM Practice Company (Oracle Corporation partner company); Former Oracle Corporation employee
– Kevin Zane: Bernie Alen’s nephew; Student at IIT Madras; Working on complex AI methods with STEM Practice Company
– Kenny Gross: Senior/Distinguished scientist at Oracle; Machine learning technologist; Patent count approaching 365; Specialist in MSET (Multivariate State Estimation Technique) and electronic prognostics
– Anshumali Shrivastava: Professor at Rice University; Serving on the superintelligence team at Meta; Expert in dynamic sparsity, attention mechanisms, and long context windows in AI; Mathematics background
– Ayush Gupta: From Genloop company; Working on agentic data analysis platforms and unifying enterprise data universe; Expert in enterprise AI applications and data analysis
– Abhideep Rastogi: From Tata Group; Working on AI transformation processes and enterprise AI implementation; Expert in organizational AI adoption processes
– Participant: Multiple audience members who asked questions during the panel discussion
Additional speakers:
None – all speakers in the transcript were included in the provided speakers names list.
Full session report
This panel discussion, led by Bernie Alen from STEM Practice Company (an Oracle partner), challenged the AI industry’s current trajectory by advocating for mathematical optimisation and software-first approaches over expensive GPU-intensive infrastructure. The session brought together academic researchers, industry practitioners, and enterprise users to address what they characterised as an unsustainable path that prioritises hardware acquisition over proven mathematical techniques.
The Infrastructure Crisis and “Burning the Planet”
Bernie Alen opened with a provocative critique of the AI industry, arguing that competitive fear has led organisations to acquire as many GPUs as possible whilst skipping the software optimisation steps that would normally be standard in large-scale engineering projects. He framed this as both a technical and moral crisis, directly addressing younger participants like his nephew Kevin Zane (an IIT Madras student) by stating that the current generation is “taking that water and that power and the planet from you.”
This environmental framing elevated the discussion beyond cost optimisation to encompass intergenerational responsibility. Kevin emphasised the sustainability crisis from his generation’s perspective, noting the massive power and water consumption of current AI infrastructure. The panel consistently returned to this theme of environmental responsibility as a primary driver for change, not merely a beneficial side effect.
Professor Anshumali Shrivastava from Rice University provided data-driven evidence for what he termed the “AI memory wall”—a fundamental scaling crisis where AI model parameter growth far exceeds hardware capability growth. Using logarithmic plots, he demonstrated that even the most advanced models will feel increasingly slower over time, making current scaling approaches mathematically unsustainable. His key insight was that “everything at the end of the day boils down to efficiency” because prohibitive experimentation costs limit the trial-and-error necessary for AI breakthroughs.
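Shrivastava's memory-wall argument is ultimately a comparison of two exponentials with different exponents. A toy calculation, using assumed growth rates rather than figures from the session, shows how quickly even a fixed ratio between doubling rates compounds:

```python
# Illustrative only: the "AI memory wall" compares two exponential curves.
# Assumed rates (NOT from the session): model memory demand grows ~10x every
# 2 years, GPU memory grows ~2x every 2 years. Any gap between the two
# exponents compounds with every period.

def gap_after(years: float, demand_per_2yr: float = 10.0,
              hw_per_2yr: float = 2.0) -> float:
    """Ratio of demand growth to hardware growth after `years` years."""
    periods = years / 2
    return (demand_per_2yr ** periods) / (hw_per_2yr ** periods)

for y in (2, 4, 8):
    print(f"after {y} years: demand/hardware gap = {gap_after(y):,.0f}x")
```

Even if both curves are exponential, the ratio itself grows exponentially, which is why the plot reads as a widening wedge on a logarithmic scale.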
Mathematical Solutions and Technical Alternatives
The panel presented several alternatives to GPU-intensive approaches. Professor Shrivastava discussed breakthrough research on attention mechanisms that can enable CPU-based systems to outperform GPU clusters when context windows exceed certain thresholds. He argued that extended context windows are crucial for complex reasoning, noting that “common sense is very hard in short context for humans, easy in long context.”
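The quadratic-cost claim can be made concrete with a back-of-envelope estimate. The sketch below is my own illustration, not code from the panel's paper: it counts the approximate multiply-adds in a single attention head, which grow with the square of the context length, so a linear increase in hardware cannot keep pace.

```python
# Rough cost model for one attention head over n tokens with head dim d:
# building the n x n score matrix plus aggregating values costs about
# 2 * n^2 * d multiply-adds. The constant is an assumption; the quadratic
# growth in n is the point.

def attention_flops(n_tokens: int, head_dim: int = 128) -> int:
    """Approximate multiply-adds for one attention head over n_tokens."""
    return 2 * n_tokens * n_tokens * head_dim

base = attention_flops(8_000)  # a modest context window as the baseline
for n in (8_000, 131_000, 1_000_000):
    print(f"{n:>9} tokens: {attention_flops(n) / base:,.0f}x the 8k cost")
```

Going from 8k to 1M tokens is a 125x increase in context but roughly a 15,625x increase in attention cost, which is the plateau the panel described.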
Dr. Kenny Gross from Oracle presented MSET (Multivariate State Estimation Technique), a sensor-based anomaly detection methodology that learns correlation patterns between multiple signals rather than relying on individual threshold monitoring. Using a musical analogy, he explained that whilst individual notes might sound acceptable, trained listeners can detect incorrect chord progressions because they understand relationships between notes. Bernie mentioned that in one healthcare use case with Tata Group, this approach achieved a 1/2,500 cost reduction compared to traditional methods whilst running entirely on CPUs.
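The chord-progression analogy maps naturally onto residual-based multivariate monitoring: learn how healthy signals move together, then flag samples that break the learned relationship even when each raw value stays inside its individual threshold. The following is a minimal sketch in that spirit, not Oracle's actual MSET (which uses a nonlinear, nonparametric estimator over many signals); the sensor names, data, and threshold here are illustrative.

```python
# Minimal sketch of correlation-based anomaly detection: estimate one sensor
# from a correlated sensor using healthy training data, then alarm when the
# live residual exceeds k standard deviations of the training residuals.

from statistics import mean, stdev

def fit_linear(x, y):
    """Least-squares fit y ~ a*x + b from healthy training data."""
    mx, my = mean(x), mean(y)
    a = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return a, my - a * mx

def residual_alarms(train_x, train_y, live_x, live_y, k=4.0):
    """Indices where live_y departs from its estimate by more than k sigma."""
    a, b = fit_linear(train_x, train_y)
    train_res = [yi - (a * xi + b) for xi, yi in zip(train_x, train_y)]
    sigma = stdev(train_res) or 1e-9
    return [i for i, (xi, yi) in enumerate(zip(live_x, live_y))
            if abs(yi - (a * xi + b)) > k * sigma]

# Healthy data: fan speed roughly tracks temperature (with small noise).
temps = [40 + 0.5 * i for i in range(40)]
fans = [1000 + 20 * t + (1 if i % 2 else -1) for i, t in enumerate(temps)]
# Live data: sample 5 has a fan stuck low while temperature looks normal.
live_t = temps[:10]
live_f = [1000 + 20 * t for t in live_t]
live_f[5] -= 200
print(residual_alarms(temps, fans, live_t, live_f))  # → [5]
```

Note that the anomalous fan reading would pass a naive per-signal threshold check; only the broken correlation between the two signals gives it away, which is the essence of the approach Gross described. No GPU, neural network, or training run is involved.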
The panel emphasised that these are not theoretical future possibilities but currently available alternatives that have been overlooked in the rush to deploy AI solutions.
Enterprise Reality and Governance Challenges
The discussion revealed significant gaps between general-purpose AI tools and enterprise requirements. Ayush Gupta from Genloop explained that whilst ChatGPT has democratised AI awareness, it cannot solve most enterprise problems due to data access restrictions and lack of business-specific context understanding.
Abhideep Rastogi from Tata Group outlined a systematic enterprise transformation approach, emphasising the importance of clearly defining AI objectives and implementing proper governance frameworks from the outset. The panel highlighted how regional constraints, particularly in India where AI costs must be dramatically lower than US markets, are driving innovation in efficient solutions.
A philosophical tension emerged around AI reliability. Kevin Zane advocated for deterministic AI approaches to address hallucination problems in production environments, particularly for critical applications like cybersecurity and medicine. However, Professor Shrivastava argued that “LLMs has to be hallucinating because it has to be intelligent and smart,” drawing parallels to human cognition and suggesting that generating novel content requires some probabilistic behaviour.
Global Accessibility and Market Dynamics
Bernie positioned India’s demanding cost requirements as a forcing function for developing globally applicable solutions, arguing that “if we solve them then I think we have wonderful solutions for everywhere else.” The panel noted that India has more mobile devices than people, creating unprecedented sensor data volumes that must be processed efficiently.
This perspective frames constrained markets not as limitations but as testing grounds for the most challenging scalability problems that will eventually affect all markets as AI adoption expands globally.
Education and Future Implications
When addressing concerns about AI’s impact on education, the panelists drew parallels to the historical introduction of calculators. Ayush noted that the key is understanding how much to delegate to AI whilst maintaining core problem-solving capabilities. Professor Shrivastava emphasised that whilst AI might make everyone “10x better,” this becomes the new baseline, and true value comes from solving problems that even enhanced humans cannot solve otherwise.
The panel acknowledged that educational curricula must evolve but noted that specific approaches for maintaining critical thinking skills whilst leveraging AI tools require further development.
Call to Action and Unresolved Challenges
The panel’s central message was that the AI industry can choose a sustainable path through mathematical optimisation rather than continuing the current GPU-intensive trajectory. They presented compelling evidence that dramatic improvements in cost, environmental impact, and performance are achievable today using established mathematical techniques.
Several challenges remain unresolved, including the integration of optimisation techniques with existing AI ecosystems, the development of comprehensive governance frameworks beyond current regulations like GDPR, and the complex relationship between AGI development and emerging technologies like quantum computing.
The speakers concluded with a call for broader education about alternative approaches and invited organisations to participate in comparative evaluations. They positioned their solutions as practical alternatives that can be implemented immediately to address the sustainability crisis whilst improving performance and reducing costs.
This discussion revealed that sustainable AI development requires returning to fundamental mathematical optimisation principles rather than simply pursuing more powerful hardware. The convergence of technical innovation, economic necessity, and environmental responsibility creates a compelling case for adopting these alternative approaches before the current trajectory becomes irreversibly damaging to both technological progress and planetary sustainability.
Session transcript
Can you hear me better? Is this better? Okay. So, infrastructure cost, it’s a very important topic because everybody who is trying to create something in AI, we all know that we are running into having to use extensive infrastructure, right? And mainly it is a GPU-based infrastructure architecture. And last two, three years, I think we are not stopping to ask the questions that we would normally ask. Are we creating these applications in optimized infrastructure? We are just running around getting as many GPUs as possible because we’re all afraid that the other guy would get it and then we’ll be left out, right? So, I think it’s a very important topic. So, there has been an extremely rapid adoption of AI.
and everybody wants to have an AI answer for everything. And so we are not asking the questions that we would normally ask in any project of this scale, right? So we’re going to take a look at what are the optimization methods and good mathematics that’s existed for a long time that we should bring in to optimizing and reducing the computation needs that these models create, that these AI applications create. And if you reduce the complexity and if you reduce the computation, then you don’t need to run a lot of these things on expensive, high-heat-generating, high-failure-rate, limited-supply clustered GPUs. You can run this on CPUs. You can run this on clustered CPUs.
You can run it on edge computers. You can even run it on mobile phones and laptops. There is a software optimization step that everybody is skipping, that we would normally not skip in software development. For any large-scale software development, a heavy amount of infrastructure optimization goes on. But we are not doing that in deploying these AI models. So we first want to make sure that there is enough understanding of the mechanisms and the methods that are available. And a lot of this is derived from mathematics that has existed forever. So we’re going to talk about that. I’ve got a great panel over here. By the way, just to introduce my company, we are the STEM Practice Company.
We are an Oracle Corporation partner company. And if you think about it, I think most people know about the Oracle Corporation, right? I don’t need to introduce the Oracle Corporation. If you think about Oracle, they have had to create solutions and create software and create products for very large customers. They serve the largest customers on the planet. So always they’ve had to worry about optimization, performance improvement and all that because without that, infrastructure cost would just be so high. So over decades there has been a collection of intellectual property, collection of ideas, collection of methods to reduce complexity of algorithms, reduce computation and therefore create better infrastructure architectures, right? So STEM Practice Company is an independent company.
We run as an Oracle partner company, but the origins of the STEM Practice Company is within the Oracle Corporation. I led advanced technologies market development for Oracle, and then we separated as a separate company and we launched as a separate company two years ago. Now we operate as an Oracle partner company. Let me introduce the team here. This is a slide that my lawyer says I should show. So just because I paid the lawyer a lot of money to make this one slide up, I’m going to show this slide. Right? Nobody knows where we are going with AI, to be honest. Nobody knows where we are going with quantum. We’re all doing the best to predict what may come, but with any prediction, use your own logic.
That’s what my lawyer wants me to say. So I’ve said it. Okay, let’s go to the next one. So this is my panel, and you may not be able to see their names on the screen. So let me start with the gentleman from the Tata Group. We are a U.S.-based company. We launched two years ago, and we just started working with our India operations, India opportunities, and we had the great fortune to start with a Tata company. And I think they are quite happy with what we have shown because some people say, hey, if you’re not using GPUs, not using expensive infrastructure, is there a compromise? Am I introducing more latency? Am I creating less confident output?
None of that. In fact, all of that gets better. And we were able to demonstrate that with the opportunity that we got working with Tata, where we were able to show that we are getting 100% accuracy and we have not used any GPUs at all in the infrastructure that we have proposed. Right? Okay. So that is Mr. Abhideep in the back with Tata. Say hi, wave. Okay. And the gentleman next to him is a part of the STEM Practice Company. He is from Oracle. I did steal him from Oracle, because it was essential to at least steal some people before Oracle gets pissed off at me. So Kenny Gross is the senior scientist at Oracle, distinguished scientist at Oracle.
He has a patent for every day of the year. So his patent count is approaching 365. So that’s Kenny Gross, a master machine learning technologist. And next to him is a professor from Rice University, Anshu, also serving on the superintelligence team at Meta. And Anshu is very passionate about, as a professor, right? Professors are usually passionate. My father is a professor, right? So that’s why he lives here and I live in the U.S. I live that far away from him, okay? Just because that’s the way I can deal with this passion. But very passionate about, hey, all of these methods, these methods to do things better have existed. So let’s make sure that we are bringing those methods, creating awareness for those methods, and he’s going to talk about how he sees what the challenges are going to be and how we already have methods to go address the challenges that are coming up. By the way, this panel is going to be very interesting, so all of you can start texting two or three of your friends to start showing up here so we can spread the word more. And next to him is somebody you all may already know. It’s one of the top successful companies that is working with the foundations of AI that is shaping up in India. It’s Ayush from Genloop, and he will talk about it in very much of an Indian context, because he has a front-row seat to everything that is going on. And then the last person on the panel is the one who I’m most proud of, because he’s my nephew, and he is in IIT.
I went to BITS Pilani, so that part I’m not proud of, that he goes to IIT and doesn’t go to BITS, right? By choice, right? So, and he is in IIT Madras, and he is working closely with us and learning deeply how to build some of these very complex AI methods up front, right? Okay, so we are a small enough team here, so always feel free to interrupt, raise your hand, come up, ask questions. The goal is to educate, because I think we are going very fast, and we are spending a lot, and we are creating problems, like we need more power generation, and we need more power generation rapidly, and because of that we are causing harm to the planet.
So it is good. You know, all this mathematics has existed for a long time, but it’s never been productized because there’s never been a market. But now there’s a phenomenal market for this, right? Because mathematicians are poor people, right? We have a paper and a pencil most of the time, right? But now we can productize it and we can bring these solutions to the market before we end up burning the planet. Right? Okay. So let’s go to… So this is what people are going to talk about. Professor Anshumali is going to talk about the problems that are coming up and how we have the solutions, demonstrated solutions, benchmarked solutions, to address the problems that are coming up.
Dr. Kenny Gross is going to talk about doing a large amount of real-time AI and stream-based AI without using any neural networks even. Not just without using GPUs, but without using neural networks and getting a very high level of accuracy, very low rate of false warnings at a tiny fraction of the cost that everybody else is spending. Okay? And then we’ll have questions for the panel and then we’ll have questions for the audience. Right? But we have no problems in making this collaborative. So sometimes a question can’t wait till the end. So just raise your hand, ask a question, and we’ll talk about it. Okay. Now I’m going to turn it over to Professor Anshu to come here and talk about how we are already ready to address
the challenges that are coming up. Right? Thank you very much. Can you guys hear this? Okay. So I’m pretty sure you must all have heard that we need AI without the cost. The cost is too much, right? There’s never enough GPUs. How many of you have heard about solutions, and how many of you have heard that, yeah, this is an idea that will definitely work, or at least there is some merit to these ideas? I think we are going to go into that. So I think the first part, that we need AI without cost, is kind of obvious. I’m not going to rant about it, though I will talk about something that motivates why the problem that you are going to see is just going to get worse. So I don’t know if you can see the plots here, but on the x-axis is the year, and on the y-axis here is the parameter count of the LLMs. Now you see kind of two interpolated straight lines. The green one is the amount of memory available in the GPUs, right? H100s, A100s, so on and so forth. And the red ones are the memory, or the model parameter count, for the demand, right? That’s models like GPT-3, GShard, Switch Transformer, Megatron, etc. What do we see here? That the rate of growth of hardware, and by the way, it’s on a logarithmic scale, so it’s exponential, the rate of growth of hardware is nowhere close to the rate of growth of demand. The other plot that you cannot see is kind of similar, but it’s in compute, so it’s in teraflops or petaflops, what GPUs can offer and what we need to reach a certain latency. This was a famous paper from Berkeley called “AI and Memory Wall,” and what you should expect is, if you are hoping that your latency will become better with newer LLMs, that’s not happening unless there is some breakthrough. Okay, so models will get bigger, but they will not be able to cope with even the GPU growth, which means models will feel slower. The better models will feel slower and inaccessible, unattainable, right?
I mean, there are many models that I’m pretty sure you cannot even run in whatever infrastructure you have, and this is going to get worse. That’s what this plot kind of says. So clearly, there is a need for what we are talking about here, right? So again, a little bit on the past work. One idea that is very popular, and basically I’ve been working on it since 2016, and it’s kind of now catching on as mainstream, was: why do we do full computations? Let’s do sparse computation. And it’s not static sparsity, it’s dynamic sparsity. What is dynamic sparsity? Well, I need all the parameters. So I’m not throwing away, I’m not going against scaling laws.
But I will pick which ones I need based on my input, dynamically. And that is called dynamic sparsity, right? So I’ve shown you two cartoons here. The traditional model is you do all the computation, which is what GPUs were built for, right? And the argument now is, well, you don’t do all the computation. You only do what is needed. But then that is not quite what GPUs are built for. But there is a sweet spot in between. You can do block sparsity and get things to work, which is what mixture of experts is, right? So mixture of experts is now the de facto way of training large language models. So one idea is obviously there.
But remember, the fundamental kernel of GPUs was always built for full matrix multiplication. And mixture of experts was kind of a bandage that seemed to work. But obviously, we need a lot more, as we have seen. So let’s take a pause here. We all have seen the evolution of models, right? Getting a foundation model, large parameter models with large capability. Where is this all going? What is the next race? And I want to argue here that the next phase is the context window. Why? What is a context window? Is everybody familiar with what a context window of an LLM is? It’s kind of a working memory, right? So let’s say I want to solve a simple problem like 2 plus 2 equals 4. That only requires a very simple context. But let’s say I want to solve an Olympiad problem, so you are asking me to prove a theorem, and I generate 40 intermediate theorems. I need to have all the theorems in my context to go to the next theorem. Otherwise, if I miss any of the theorems, if it goes out of context, I cannot prove things. So more context window means I can process more information, correlate across it, and make decisions. So complex workflows will start to happen when the context window grows, and that is what we have seen with GPTs, right?
GPT-3 came up with a small context window, and the context window has been growing. Now we know Claude Code kind of works because it has, what, a 200k context window or something, right? And even then, I don’t know how many of you have experienced that you have to compactify the context because you run out of context window, right? What this plot shows is, on the x-axis, I don’t know if you can see it, is the year, and on the y-axis is the context window. What do we see here? Almost a flat plateau after a while. And by the way, that’s also experimental. A 10 million context window is experimental. The closest is 1 million. That is what you can achieve and play with.
But it has plateaued. People are not talking about 100 million context windows and more. And it is very clear to people that more complicated tasks mean a more complicated context window. And we believe this is what the next race is. At least I am very bullish that this is what the next race is. You want to do complex automation, very complex automation, right? We talk about building agentic workflows and all that. But I believe we are underestimating how much complex automation we want to do. And we believe that we are underestimating how complex common sense is, right? Common sense workflows require a lot of reasoning, and it will not happen unless we have large context windows.
But large context windows are plateauing, and we are talking about some of the frontier models. So let me tell you what the current problem is. The mindset is, okay, the kernel remains the same, which is full matrix multiplication. Let’s apply bandages like mixture of experts and whatever, stretch as much as we can, and see where we go. That’s strategy number one. That’s probably one strategy that we know of that seems to work, but it has plateaued. We’ve seen in the previous plot, it has plateaued. What I am bullish on is, we have to rethink to break that plateau. Okay. And again, I’m not going to go very technical. This is an upcoming paper at ICLR. But I want to argue there is a new math, a new way of doing attention.
Again, I’m not going to start uttering words like sharpened softmax and exponentiated and all that. You can read the paper; it’s coming at ICLR this year and is going to be presented in Brazil this summer. But what we have shown is that if you change the math of attention, then there is something which gives you the same capability but at a different cost. So it’s changing the math, rethinking the math, like dynamic sparsity, right? It’s some sort of a sketched way of estimating things. What is interesting is we have experimented with this. So if you see this plot, on the x-axis is the context window, and on the y-axis is the latency, the time to first token or tokens per second. The two red plots are the best attention mechanisms, FlashAttention-2 and FlashAttention-3, on the best possible hardware, the GH200. And the green one is actually the new math on a CPU. Now what is interesting is, if the context window is below 131,000, GPUs are obviously faster, which makes sense. But as I go beyond that, the CPUs dominate.
And actually, it’s not the CPU. It’s the algorithm. And the reason is context windows scale quadratically in attention. So you can throw as much hardware as you want, but you cannot beat quadratic complexity. Right? You are throwing a linear number of GPUs to tackle something that goes quadratically. So something goes like 10, 10 square, 100, 100 square, and you are just doubling things. That’s not going to work. That is what this kind of plot shows. It says something fundamental. So what we are trying, and again, this is what I argue. I’m not going to bore you with the math. But what we are trying to argue is the hope is, and remember the title of the talk: how.
The how part is the rethink. We have to rethink beyond how attention is done. Because in the current race, if you have 1,000 GPUs, if you have 10,000 GPUs, you are 10x ahead of that person. But that race is plateauing because of the quadratic complexity. So yes, you will always be ahead because you have more GPUs but not very far ahead. But if we change the math, then we can actually break that plateau and I believe we can unlock capabilities of the next level. We will see automation that hopefully we expect is possible. Again, I will say parameter count and benchmark hacking. We have seen it enough. We want to now see complex tasks happening.
And it is my belief, again, I am an academic, so one of the things as an academic, you get to ask hard questions and you can think about them for a very long time. So for me, the next race is, can I break the barrier of how many complex tasks we can solve with the LLMs using this context window? And I believe if we can make progress there, that’s a very tangible, real progress. So can we go to 100 million contexts faster than others? I think we can, and with that I would stop my
So the energy savings come from the three-orders-of-magnitude lower compute cost. We’ve given four presentations at NVIDIA GTC conferences demonstrating, with real data, the reduction in compute cost. The other aspect I wanted to mention in terms of data centers is prognostics for avoiding downtime in servers and chips, CPUs and GPUs. We developed, and published long ago a few dozen papers on, the new AI MSET; MSET 3 is capable of detecting all the mechanisms that cause CPUs and GPUs in data centers to fail, days and often weeks in advance of failure. This avoids downtime. Now, in prior data centers, five years ago, downtime wasn’t a big deal, because if you’re just doing web-serving applications, or even database applications, there’s a lot of horizontal redundancy.
With the new AI workloads, though, when a company is running a five-day training run on their LLM, system board failures are very costly. One spinoff of MSET for data center applications is called electronic prognostics, where we’re able to detect all the mechanisms that lead to failures of chips and system boards in data centers. And the final point I wanted to make, with that bottom bullet there, is about your data. MSET has been used in locomotives, wind farms, all aspects of utilities, and all defense domains: land, air, sea, and space. What we always tell other industries is that if you have data, historian data, from whatever system you’re using now, we welcome doing a blind bake-off with your own data.
And whatever technique you’re using now, a third-party commercial technique or a homegrown technique, we’ll be happy to demonstrate with your own data in a bake-off, where the winning criteria are lowest compute cost, earliest detection of incipient anomalies in the assets, and the lowest false-alarm and missed-alarm probabilities. With conventional approaches, it’s the false alarms that cause a lot of losses, from shutting down revenue-generating assets unnecessarily when they’re not actually broken. And missed alarms can be catastrophic; in life-critical industries, they can be extra catastrophic. So that’s an overview of our AI MSET. I’ll turn it over now.
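As a mental model of what a similarity-based estimator like MSET does, here is a minimal sketch. This is not Oracle’s implementation; the Gaussian kernel, the bandwidth, and the sensor values are all my own illustrative assumptions. The idea it demonstrates is the one described above: reconstruct each new reading from a memory of known-good states, and treat a large reconstruction residual as an anomaly.

```python
import math

# Minimal similarity-based estimator in the spirit of MSET (a sketch,
# NOT Oracle's implementation). A memory matrix holds known-good states;
# a new observation is reconstructed as a similarity-weighted blend of
# them, and a large residual signals an anomaly.

def estimate(memory, obs, bw):
    dists = [math.dist(m, obs) for m in memory]
    weights = [math.exp(-(d / bw) ** 2) for d in dists]  # Gaussian similarity
    total = sum(weights)
    return [sum(w * m[i] for w, m in zip(weights, memory)) / total
            for i in range(len(obs))]

# Memorized normal states: (temperature C, fan RPM) pairs (made up).
memory = [(60.0, 3000.0), (65.0, 3200.0), (70.0, 3400.0)]

normal = (66.0, 3250.0)
resid = math.dist(normal, estimate(memory, normal, bw=500.0))
print(resid < 100.0)    # a normal reading reconstructs well → True

anomaly = (60.0, 5000.0)    # fan racing while the temperature is low
resid_a = math.dist(anomaly, estimate(memory, anomaly, bw=500.0))
print(resid_a > 100.0)  # the inconsistent combination stands out → True
```

The real MSET uses a nonlinear similarity operator and a sequential probability ratio test for the alarm decision rather than a fixed residual threshold; the sketch only shows why multivariate similarity catches combinations that per-signal thresholds miss.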
Okay. Thank you, Kenny. So in one of the use cases where we used the MSET method for anomaly detection, the cost of running the use case was 1/2,500th. It’s not just a 10x or 20x reduction; you’re talking about a reduction of 2,500 times. That’s the power of these kinds of methods. So certainly, before you start implementing whatever AI method you’re using, whatever solution you’re going after, educate yourself on the kinds of methods that exist, and feel free to reach out to us. Not everything needs to go through a massive GPU cluster to be solved. Okay. So we’re going to go to the panel now and ask the panel some questions.
So I’m going to start with all the panelists here. We can first talk about how things have never been this crazy. I think in the last two and a half years, the world has gone a bit mad in some ways, because everybody is chasing this, and everybody feels a great sense of urgency to chase this. So how do you all see this? I think there are a lot of challenges in AI. Maybe we’ll start with you, Abby, and then come down closer to me.
Sure. So what I’ve observed in the recent past, taking the last two to three years as an example, is that we started with the GenAI chatbot. That was a very big thing at that point. Now I can see the trend that everything is moving from GenAI chatbots toward workflow automation, where agentic AI and agents are running at the executive level as well as in enterprise tools, already executing full workflows that used to be handled by a person or a particular piece of code. So it’s automation that I see in our current organization, and even when I talk to other clients, they are also looking forward to these kinds of things.
That’s my understanding on this.
Kenny, you want to comment on the same thing? What challenges are you seeing, how are we doing things now, and how fast are we going? What is your prediction for what’s coming our way?
One of the early challenges with MSET pattern recognition was getting the sensor signals out of the asset to a central location. That challenge has now been solved for most industries, and certainly for the data center industry. The challenge in the early days, when the two biggest locomotive manufacturers in the United States licensed MSET, was that they had to put a computer on the train to monitor the signals, because there were no good techniques for offloading the signals from a locomotive. Now there are good wireless networks for bringing the sensor data out. And back to the data center: at Sun Microsystems we developed computer system telemetry that picks up all the signals from all the sensors and processes inside servers.
Voltages, temperatures, currents, fan speeds, and in many cases vibration sensors are in the servers too: thousands of variables. And we’ve made a very lightweight harness that doesn’t interfere with the customer’s compute capacity at all; it runs on the system processor and brings the telemetry out. So that challenge has been solved. And now, with the latest GPU servers, there is a commercial system, Prometheus, and on December 15th NVIDIA released freeware telemetry for all their servers and clusters. So that challenge has been solved, and we at STEM can show you how to stream the signals from any asset, airplane engines, autonomous vehicles, anything, into a compute box that is lightweight, running on CPUs, not GPUs, and that gives real-time prognostics with early warning of incipient problems, not a high-low threshold.
That’s what they use now, and by the time something hits a high threshold, something is already severely wrong, or the system crashed before it ever got to the threshold. We are able to detect the onset of anomalies below the noise floor; they’re buried in chaotic noise, and MSET is able to detect their onset. So that would be the challenge: if somebody doesn’t have sensors in their assets, they’re going to have to wait for next year’s model and put sensors in. But most assets now have lots of sensors; they just don’t have a good technique to consume that data and give prognostics without having to send somebody off to get a master’s degree. It works out of the box.
We hook up the sensor signals to MSET and get early-warning annunciation of anomalies. And the energy savings are very significant, because the control algorithms now have highly reliable signals going into them. MSET is the only technique that can disambiguate between sensor problems and problems in the assets. So the control algorithms are using fully validated signals, which means much more efficient operation, and if anything starts to go wrong in the assets, you get an early warning of it.
Thank you, Kenny. Anshu, what’s your take?
So, as I already said, I’m very bullish on long context. Let me give you an example. By this time, we all know chess is easy, math is easy, programming is easy. I think common sense is very hard. Common sense is very hard in short context, even for humans, and easy in long context. If I keep talking with you over a period of time, no matter how much I think, I’ll figure out that you’re bored. It will take me some time, but I’ll figure it out. So you need long context to figure that out. And I think machines are gaining context right now, but they are gaining it quadratically, which is what I talked about.
So I believe the biggest complaint in enterprises right now is that agents do not have common sense. They hallucinate. They are not 99% accurate; they are like 50 or 60% accurate. To go from 50 or 60 to 99, you need that context: you work with a human, and over a period of time you figure out, damn, this guy needs this. That will happen when we have really long context. So I will just double down on what I said: the next thing is efficiency and long context.
Very good. And what do you think? You have a front-row seat to everything going on around here, so you have not only the large context, you have the relevant context too. So tell me.
First of all, thanks for the question, and a very good evening to all who have joined. We are in the space of unifying the entire data universe of an enterprise and providing an agentic data-analysis platform. What that means is that a normal business user, who so far was used to just static dashboards, can come to a system, have conversations, get proactive insights, and make better decisions faster. The most exciting part in that context for us is how proactive decision making and the right quality of insights can help enterprises improve: top line, bottom line, efficiencies, and so on. For instance, we have increasingly seen that the need for the big data warehouses and ETL pipelines that so far had to be maintained will go down in the future, because until now everything had to come into a single-source-of-truth table from which human analysts could query and get insights, or power the Power BI dashboards.
But now with agentic analysis, when agents can connect to different data sources and different modalities, not just tables or PDFs but also images, presentations, documents, and so on, you might not need to create multiple replicas, copies, and versions of the data set: the bronze tables, silver tables, gold tables, and so on. You might just connect to those native systems of record directly and get the insights you need. We have seen that happening with a lot of our enterprise customers: they see value when agentic analysis gives their business users very good insights. So that is the most exciting part for me: how can data analysis deliver ROI to an enterprise? And the challenge for that is exactly quality and reliability. How do you make sure those insights are of high quality? Not just “hey, sales are down,” but why are sales down, what are the next steps you can take to fix it, and if you have not been able to hit your targets in your store, what is going wrong and what are the other stores doing that you could learn from and do better? And the other part is the reliability of insights.
It’s not just getting it right 1 out of 10 times; it’s getting it right 10 out of 10 times, even on questions that are less known or unseen, and unlocking value. And lastly, I touched on the ROI point, and that is where there is synergy with what STEM is doing. In the US it’s still fine to charge roughly a dollar for one insight; if I do the rough math, that still comes out to a decent enough ROI when you are paying $125,000 to a data analyst for the same insights, in case you have to hire one. But in India the cost has to come down even further. It probably has to be 1 rupee per conversation to unlock the same quality of insights.
And the major cost driver is the GPU. How do you get cheaper inference? That is where I’m excited about what you are doing at STEM. We are hosting our own models many times; we are also one of the companies training SLMs to power this use case. So the exciting question for us is: can we have an alternate architecture that scales and gives us a very low cost of inference, so that we can offer the same technology at much greater scale?
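The back-of-envelope ROI arithmetic mentioned here works out roughly as follows. The analyst salary and the two price points are from the discussion; the USD/INR exchange rate is my own assumed figure for illustration.

```python
# Rough ROI arithmetic from the discussion. The $125,000 analyst cost and
# the $1-per-insight / Rs 1-per-conversation price points are from the
# panel; the USD/INR rate of 83 is an assumed illustrative figure.
analyst_cost_usd = 125_000
us_price_usd = 1.00
india_price_inr = 1.00
usd_to_inr = 83.0   # assumption

# At $1 per insight, one analyst's salary buys this many insights.
insights_per_analyst_cost = analyst_cost_usd / us_price_usd
print(insights_per_analyst_cost)   # → 125000.0

# The Indian price target is roughly 80x below the US one.
ratio = us_price_usd / (india_price_inr / usd_to_inr)
print(round(ratio))
```

The point of the arithmetic is that hitting the rupee-per-conversation target is not a matter of margin trimming; it needs an order-of-magnitude cheaper inference stack, which is where the CPU-based approaches come in.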
Very good. And before I go to you, I want to say that this is why I think a lot of these solutions can be perfected in India: because India is going to throw the toughest problems at us, and we have to solve them at massive scale. India has more people than anywhere else; everybody knows that. But India also has more mobile devices than it has people. And you talk about sensors: tens to hundreds of thousands of sensors all streaming in from a very large population, and you are telling me I need to give it to you for a rupee. So India is going to throw the toughest problems at us.
I saw this somewhere: built in India, but for the world. So I think if we solve these problems, then we have wonderful solutions for everywhere else. Okay. So what excites you? You just got into IIT Madras, and you are doing well. Thank you for that, even though you didn’t go to BITS like I asked you to. Go ahead and talk about what is exciting to you and where you see the challenges.
I think I’ll start with the challenges on this one. The challenge I’m going to talk about is the sustainability of AI, because that’s something that has grown increasingly relevant of late. As Anshu said, we’re rapidly approaching a hard limit on how scalable GPU-based infrastructure is, with a very large impact on the environment, on the water and power required to fuel these GPU server stacks. I’m excited mostly about what STEM is doing: STEM’s ability to use better algorithms to increase AI’s efficiency and speed without taking up massive amounts of power and water and damaging the planet in the process.
Very good. We are taking that water and that power and the planet from your generation; that’s the key point. We won’t all be around, but that’s the thing, and this is very important. By using this expensive infrastructure, we create other high costs: I need more power, I generate more heat, therefore I need cooling, the cooling needs more power, we need more power plants, and everything can break down because none of these are very reliable systems. We need to be very careful about what we are doing to the planet by moving this fast and believing this is the only method out there.
So it’s a very high responsibility and a heavy burden on everyone to understand these other methods that exist. This is good mathematics, so that the software can reduce the hardware requirements. That’s the sustainable method out there; that’s the responsible method out there. Okay. Let’s go to the next question. It’s about process. I’m going to start with you, Abby, from Tata. Once we know what to do, how do you take an organization through the change of going from manual processes to automation, and to automation of decision making, which is where the autonomous nature and artificial intelligence come in? And we have to address what it means for people who are so scared about job loss and everything else.
So talk about the process.
So in our organization, we follow multiple stages. If any use case comes to us, anyone asking to perform a certain task through AI, that’s a very broad ask, right? So we start with stage zero: what is your aim in using AI? Is it cost reduction? Is it revenue? Or is it something you want for customer experience? Once we finalize that, we move to stage one, where you map the AI to the opportunity you’ve been handling: okay, I’m interested in revenue generation, so it will attach to the finance department, and how will a finance application be useful for that?
So that’s where the stages come into the picture. Once you finalize stage one, the next stage is about your data. Data is the critical part of the journey and the transformation: where is your data? Does your data have quality? Does data lineage exist? What are the sources of the data? Is it legacy data? Is it already clean, or does it need to be transformed into clean data? That’s the big area where the data part comes in. Once you have the data and all your alignment is done, the next stage is your architecture strategy. Under that big umbrella, you first have to finalize your deployment strategy: are you looking at GPUs or CPUs? What type of deployment are you planning, on-premises or a hyperscaler? Once you finalize the deployment, you come to the model: are you looking at an SLM or an LLM, and what else needs to be done? Then, where are you going to host the model? Once your architecture is finalized, your compute strategy also comes into the picture: are you going to run on virtual CPUs, or is it something you can run on your local system? It depends, use case to use case, right?
Once you have done that, what we prefer is a pilot execution, where we get to know what my accuracy is, what my ROI can be estimated at, and how I’m going to achieve a particular target using this use case. Once that is done, governance comes into the picture as the next stage, where you put in guardrails and policies: are there GDPR compliances, or anything like HIPAA where healthcare is concerned? Once your governance is finalized, you move to platformization, from a POC to a productionized, enterprise-level deployment, where you have everything sorted.
So you have all the details of what you are going to do, and you are ready to go live with the AI transformation. These are the stages we usually follow. But the next stage, which we follow internally, is for your employees: how are they going to learn what we did? That is even more important, because in the future new use cases will keep coming up, and with that background they’ll have better alignment. So this is how we usually approach transformation in our organization.
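The stages described above can be summarized as a simple ordered checklist. A toy sketch follows; the stage names are paraphrased from the discussion, and the strictly sequential gating logic is my own illustrative assumption.

```python
# The staged AI-adoption process described above, as a toy checklist.
# Stage names are paraphrased from the discussion; the sequential
# gating is an illustrative assumption.
STAGES = [
    "0: define the aim (cost, revenue, customer experience)",
    "1: map AI to a business opportunity and owning department",
    "2: assess data (quality, lineage, sources, cleanliness)",
    "3: architecture (GPU vs CPU, on-prem vs hyperscaler, SLM vs LLM)",
    "4: pilot execution (accuracy, estimated ROI)",
    "5: governance (guardrails, GDPR/HIPAA-style compliance)",
    "6: platformization (POC to enterprise production)",
    "7: employee enablement",
]

def next_stage(completed):
    """Return the first stage not yet completed, or None when done."""
    for i, stage in enumerate(STAGES):
        if i not in completed:
            return stage
    return None   # all stages done: ready to go live

print(next_stage({0, 1}))         # → the data-assessment stage
print(next_stage(set(range(8))))  # → None
```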
Very good. I’m going to take that and segue into what I wanted to ask you, Ayush. He talked about governance, and I don’t think we have completely cracked the code; we don’t even have a code for it yet. One of the questions we wanted to ask you: at one point, AI was synonymous with ChatGPT. Outside our technical circles, if you talk to a doctor or a lawyer, they say, oh, I’m using AI. What are you using? I’m using ChatGPT. And those two professions especially are very concerned about governance. So can you talk more about that aspect? When you take the models to the end user, why is it not all just ChatGPT?
And is it governable? If you have these big, large, open-source models and whatever you’re building on top of them, at the end of the day, is the intelligence that’s been created governable?
So ChatGPT has definitely been very instrumental in democratizing AI and has become a symbol of what AI means in the new world, and I’ll give credit to them for that. But for an enterprise, ChatGPT does not solve the majority of the problems. It can be good for some light tasks, like email writing or some personal planning. But in an enterprise, when it is about making real decisions, or even taking actions, say I’m on the customer success team and I want to create a presentation for my customer about their usage in the last month, the issues they had, and how much time we took to solve them...
This is something that cannot be done in ChatGPT, for two reasons. One, it does not know your enterprise data, and you cannot connect all your know-how to systems like OpenAI’s, because companies like OpenAI and Anthropic are all tracking what kind of activity is happening on top of their APIs and then planning their next expansion as an application. We’ve seen that with the Cursor coding use case transitioning into Codex from OpenAI and Claude Code from Anthropic. So you never connect your enterprise data, because of compliance and privacy, tying back to the data governance aspect. Second is the context.
Now, what separates a steel company in Texas from a steel company in another region of the US, or any one company from another even in the same vertical, is the context: the culture of doing business, the KPIs, how the processes are set up, what actions they take when doing an RCA, what the decision-making activities are. That is the core of the business. That core of the business is not known to systems like ChatGPT or Claude, for two reasons again. One, they don’t know the process; the data is not explained to them, so they don’t know how to do it.
Second, they are very general, stateless APIs that will never be able to understand those nuances without learning. So those are the realities of enterprises, and that is why the ChatGPTs are not solving the real enterprise problem: because of the context and the understanding of the business itself.
Very good. So, leading from that, Kevin, what I would ask you is this: he said that a large enterprise’s context needs to be understood by open-source models, and there’s a responsible way to do that; you cannot just release all enterprise information to the public. But he also said we need things like root cause analysis, which leads to deterministic AI. So talk about deterministic AI, and also about the sovereignty aspect he raised: we may use public-domain models where it makes sense, but we need to do it in such a way that the data remains completely sovereign.
Go ahead. Talk about it.
See, deterministic AI is a solution to a very specific problem with most modern large language models, which is that they’re quintessentially probabilistic. You can give ChatGPT the same prompt twice and get a different result. ChatGPT also has the capability to just make stuff up; it is not bound to fact, and it is not bound to a stringent set of rules. And that’s great when you want to generate a picture of a cat on the Eiffel Tower or write a Shakespearean ballad. But if you need to apply it in production, then hallucinations and false data are not something you can afford in those kinds of situations, say cybersecurity or the medical field. That’s the very specific problem we use deterministic AI to solve.
At its core, it’s an architectural response to this problem. We don’t eliminate machine learning entirely; we bind it within a set system and a set of rules. The objective isn’t open-ended generation but controlled, auditable execution. So generally I would say there are a few very core principles to this sort of approach. Your system has to be predictable, meaning it must give the same output for the same input, because that directly leads to auditability, which is a very difficult thing to achieve while maintaining intelligence.
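The "bind it within a set of rules" idea can be sketched as a thin deterministic wrapper around a model call. All names below are hypothetical illustrations, not a specific product: outputs must pass fixed, deterministic validators before they are used, and every decision is logged for audit.

```python
from dataclasses import dataclass, field

# Sketch of controlled, auditable execution around a model call.
# All names here are hypothetical illustrations, not a specific product.

@dataclass
class DeterministicGuard:
    validators: list                       # fixed, deterministic rules
    audit_log: list = field(default_factory=list)

    def run(self, prompt, model):
        output = model(prompt)
        ok = all(check(output) for check in self.validators)
        self.audit_log.append((prompt, output, ok))   # audit trail
        return output if ok else None      # reject rather than pass bad output

# A toy "model" that is deterministic by construction (no sampling).
model = lambda prompt: prompt.upper()

guard = DeterministicGuard(validators=[
    lambda out: len(out) < 50,             # bounded output
    lambda out: "DROP TABLE" not in out,   # forbidden-action rule
])
print(guard.run("restart pump 3", model))    # → RESTART PUMP 3
print(guard.run("drop table users", model))  # blocked → None
```

Note that with a real LLM even greedy decoding is not always bit-reproducible across hardware; the validator layer is what restores determinism of the accepted behavior, which is the property that makes audits meaningful.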
We are all playing around with creating intelligence, but truly, it’s been done once before. Whatever faith you believe in, it’s been done once before, and what were we all told? You have your free will. So once you’ve created intelligence, putting it in a box is a very, very difficult thing to do. But if you cannot put it in a box, how can you have a governance function? At some point it’s going to say something that will embarrass your customer. So how can you have a governance function? Some thoughts?
So there are a couple of rules we need to apply; that’s what I can think of at this point. There are rules like GDPR, and the DPDP Act is coming into the picture for India specifically, and we follow those rules where they are applicable. If not, then we may have to think about it from the policy side of the company, about whether the rule will be applicable or not. There are scenarios where, to be very frank, PII data doesn’t matter much in India yet, but if you are operating in the US or elsewhere, it does matter. So we have to take care of those scenarios when we are implementing.
So at our organization, we have to make sure we are following all the applicable policies and that all the guardrails are in place. That’s my understanding.
Kenny, any thoughts on governance? I know you deal with sensor data, which comes from measured things, not made-up stuff for the most part, unless the sensor itself is showing some pathology. When the sensor misbehaves, what do we call it? We call it sensor pathology. You see how we blame the human race for that. But anyway, from that governance point of view, you live in, I think, a less complex space than people who are generating user content. So what are your thoughts on governance?
One of the biggest challenges for governance is in applications where there is human-in-the-loop supervisory control of complex processes and systems. This year it has turned into the biggest challenge for defense AI; it’s called situational awareness. With situational awareness, you can have a highly trained human operating a ship or an airplane, and false alarms in the process are a problem. We talked about hallucinations with ChatGPT; in physical systems, it’s false alarms on sensors. And I keep going back to the false-alarm rate because of the number of sensors: if you go from 6 sensors to 600 sensors, the probability of false alarms multiplies accordingly, and the same up to 50,000 sensors.
So you have a pilot of an airplane who has been highly trained for every situation, and when they test pilots for their license in the big simulators, they throw in a second problem while the pilot is dealing with the first. The challenge with false alarms is that you can have the most highly trained human, and if red lights are going off all over the place from false alarms, the human reaches the point of cognitive overload and makes mistakes, long before any hallucinations out of AI. One example, and I’m not talking out of school or giving away secret information: the US Navy in the last five years has had three spectacular accidents, in broad daylight, with the latest instrumentation on ships, where they ran into a big oil barge or a fishing vessel. Hundred-million-dollar accidents, and some of them resulted in loss of life. The human bridge watchers, as they’re called, sit in a sophisticated control room; imagine the cockpit of a 777 and multiply it by a hundred. You have highly trained humans watching all these signals, and if too many things are happening and too many false alarms are going off, the human gets mixed up and reaches cognitive overload.
We’ve published half a dozen papers in international cognitive science conferences around the world and demonstrated how MSET is able to eliminate that problem for monitoring complex processes where a human has to make decisions. And the one technical point I’ll make, and this is in a lot of our journal articles: MSET has the lowest mathematically possible false-alarm and missed-alarm probabilities.
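The point about false alarms multiplying with sensor count is compound probability: if each sensor independently false-alarms with some small probability per watch period, the chance that at least one red light comes on grows quickly with the number of sensors. The per-sensor rate below is an assumed illustrative figure, not a number from the talk.

```python
# P(at least one false alarm) = 1 - (1 - p)^n for n independent sensors.
# The per-sensor rate p is an assumed illustrative figure.
p = 0.001   # false-alarm probability per sensor per watch period

for n in (6, 600, 50_000):
    p_any = 1 - (1 - p) ** n
    print(f"{n:>6} sensors: P(at least one false alarm) = {p_any:.3f}")
```

With these assumed numbers, going from 6 to 600 sensors takes the chance of some light turning red from well under 1% to nearly a coin flip, and at 50,000 sensors a false alarm is essentially certain, which is the cognitive-overload scenario described above.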
So Anshu, we’ve collected all the requirements in this conversation, and now we’re going to give them all to you to actually solve. We’ve said there’s a step-by-step process to all of this; there’s a sensor explosion, there are sovereignty and RCA-type requirements, and there’s a user content and data explosion. Finally, when we map it all to where the compute is to go do all this, because a lot of these are complex algorithms, current methods are not taking us there, right?
So I’ll just add one thing here. Look, if you look at the progression of AI, everything is still driven by one of the most powerful methods humankind has: trial and error. How do I know that prompt engineering works? If anybody has worked on prompt engineering, you keep trying, and at some point you suddenly see it solves 80% of the problem; that’s a good prompt, and then you hill-climb from there. The whole of AI right now is that we are dealing with a new entity, a new species. We are trying to co-live with them and we don’t understand them. It’s not very different from my brain: sometimes it works; maybe on Tuesday it doesn’t, because of whatever my schedule is; but I have learned over time to live with it. We are asking some very important questions about governance, guardrails, all of that, and I think we will solve a lot of them with trial and error. But the most important thing is that trial and error should be regret-less. If a million tries means burning hundreds of millions of dollars, I would be careful. So I will still say the biggest hurdle to advancement is the ease with which I can trial-and-error and experiment, and that ease is directly proportional to how much energy we are burning and how much money we are paying. Imagine if compute were free. Imagine I give you the best model and as many queries as you want, and now take the hardest problems you are facing: governance, accuracy. I am pretty sure if you sit down and hill-climb, make 10 agents, let them talk with each other, figure out some strategies, go to dinner, maybe sleep overnight while these agents keep talking, all of them the most expensive model running at the highest possible latency, you will make remarkable progress. But you won’t be allowed to do that. And that is why I come back again, and this is why this panel is very important to me: everything at the end of the day boils down to
efficiency. It’s like raising the tide, because it raises all the boats. All interesting problems will be solved if you are allowed enough trial and error. That’s my belief.
And that’s the thing: the title of the panel is about a constrained world, right? We can’t just all mint money; I’ve tried that, it doesn’t work. It’s a constrained world, so how do we solve this problem? This is probably the largest conference ever. This is not a conference; this is the AI Olympics. People are talking about some 700,000 people. That is the kind of scale we need to solve for. Think about it in the AI space: every day it’s going to be this busy, this heavy, this crowded, with this much data, and so on. We can’t keep throwing expensive infrastructure at the problem.
We’ve got to get better. We’ve got to understand that all these other methods exist, implement them, and have sustainable AI. So, questions from the audience? There’s enough time for all of you to ask at least one question each. Or if you just have an opinion or an input, that’s fine too. Go for it. Who wants to go first?
There is this trend of "AI will solve everything" coming into the picture. And you talked about hallucination. In a lot of engineering, whether it is automotive, ships, aircraft, or naval, the solution is not always probabilistic. It is also binary: sensors give a zero or a one, so you need to decide. So, applying this in the real engineering world, where we have to be deterministic to be safe: you said MSET could solve all these problems, but if you could demystify MSET for me, that would be great.
Oh yes. The best way to demystify MSET and the way it works: the conventional approach for monitoring signals from an asset, say an automobile or a locomotive, is to put high-low limits on each variable. If the engine gets too hot, a red light comes on the dashboard. If the fan gets a bad bearing and doesn't spin fast enough, that will cause a problem. The coolant can get too hot, pressures too high, RPMs too low. This has been the conventional approach for decades: high-low limits on thresholds. The problem that will never go away with putting high-low limits on individual signals, which is called univariate monitoring, is that when you're monitoring noisy physics processes, if you want an earlier warning of a small developing problem, you reduce the thresholds.
But then spurious data values will trip the thresholds, and you're shutting down a locomotive in the middle of Kansas with a bunch of cattle on the back. They send the repair people out: oh, there wasn't anything wrong with it, it was a false alarm. It's very expensive to shut down a manufacturing plant over its assets, only to hear, sorry, it was just a false alarm. And people who take their car in on a Saturday because of the red light are told: there wasn't really anything wrong, that's the good news, you should be happy, it was just a false alarm. So to avoid the false alarms, they raise the thresholds. Now the system can be severely degraded before you get any alarm, and it's in no way predictive.
So let me say: high-low thresholds are reactive. MSET works fundamentally differently. It learns the patterns of correlation between and among all the signals. Some signals go up and down in unison; some go up when others go down. It learns those patterns, and it detects an anomaly in the pattern days, and often weeks, before you'll ever get near a threshold. That's the fundamental difference.
Do you play music? What do you play? Can you hear me? Okay. You play chords? So it's simple, right? When you're playing chords, and let's say you're like me and do a bad job of it, even an untrained listener can tell I'm doing a bad job. Why? Because those notes need to go together. With independent notes, maybe you cannot tell. But if you're playing chords, anybody can say that guy is off. Same way here: looking at the variations of a single variable can only take you so far. But the multivariate part of MSET, where it looks at a joint set of sensors at once, lets you figure out when things are starting to go bad.
Misery loves company, have you heard that? Similarly, anomalies don't like to be alone. They're always hiding among other anomalies, right?
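To make the chord analogy concrete, here is a minimal Python sketch of the idea behind multivariate anomaly detection. It is not Oracle's actual MSET, which memorizes representative healthy states and uses a nonlinear similarity operator; here plain linear regression stands in for the learned correlation structure, and all sensor names and numbers are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Healthy training data: three correlated sensors. By construction,
# sensor 1 tracks 2x sensor 0 and sensor 2 tracks -1x sensor 0.
t = rng.normal(size=500)
X = np.column_stack([t, 2 * t, -t]) + 0.05 * rng.normal(size=(500, 3))

# Learn, for each sensor, a linear estimate from the other two sensors.
coefs = []
for i in range(X.shape[1]):
    others = np.delete(X, i, axis=1)
    w, *_ = np.linalg.lstsq(others, X[:, i], rcond=None)
    coefs.append(w)

def max_residual(obs):
    """Largest gap between any sensor and its estimate from the others."""
    gaps = [abs(obs[i] - np.delete(obs, i) @ w) for i, w in enumerate(coefs)]
    return max(gaps)

# Both observations sit inside every sensor's individual high-low limits,
# so univariate monitoring passes both. Only the second breaks the pattern.
normal = np.array([1.0, 2.0, -1.0])     # follows the learned correlations
anomalous = np.array([1.0, 2.0, 1.0])   # sensor 2 stopped tracking -sensor 0

print(max_residual(normal))     # small, around the noise level
print(max_residual(anomalous))  # large: the pattern anomaly is caught
```

This is the "bad chord" in code: each note is individually in range, but the relationship among them is wrong, and that is what the multivariate residual catches.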
Okay, yeah. So I came in by accident but was really interested to hear what's being discussed, especially MSET and the power of reducing compute and moving it onto the CPU. It really was music to my ears. I have a follow-on question. What happens to the current ecosystem, where plug-and-play and interoperability exist across all of data engineering, RAG, MCP, and so on? Is it possible to plug and play this? What I understand is that MSET sits at a foundational, fundamental layer. So how does it merge with the current set of LLMs and services?
So we have to look at the problems we are trying to solve and how we build the correct architecture for them. The quick answer to that question is: absolutely. Through sensor augmentation we are instrumenting a big field with sensors, and we will certainly feed that into techniques like the multivariate technique for anomaly detection, predictive maintenance, and so on. After that, if you want a control system that goes and deploys the decision-making, there will be other MCP-based solutions that we develop. So it does all integrate. That's the reason we have to look closely at the problem and make sure that not every problem starts with downloading a large language model.
I just have a follow-up. Are there any open ecosystems where we can go and see this and plug it into our current infrastructure and GenAI services?
Yes, because remember, STEM Practice Company is an Oracle Corporation partner, so there is a lot that we do in the open-source community and open integration. We can certainly spend time with you to walk you through all of that. Any other questions? Hello, everyone. So what is the most critical risk that policymakers, businesses, and users are currently misunderstanding about AI? Avi, you want to take that, and then Anshu, you can go.
So it depends from use case to use case, first of all, and at this point from country to country. Think about ESG. The EU AI Act is one of the first such acts released anywhere in the world. Similarly, on the data side, the DPDP Act is coming into force in India. We have been following all the policies being implemented, and we are also thinking ahead of time: the AI Act will soon come into the picture in India as well as in other countries, so what do we need to do to make sure it is being followed properly? That's the process we follow.
If I understand correctly, the question is: what is the misunderstanding that policymakers have about AI? Sorry?
No, sir, it's not about a misunderstanding by the policymakers. The DPDP Act was passed in 2023, but even after a long time it has not been enforced or implemented in the current situation. Some IT laws do exist in our country, but they are not sufficient for present and future cyber-related crimes, and there are AI loopholes, because we cannot get accurate law from the IT laws alone, especially the old ones. The DPDP Act, okay, it's enacted, but what is the enforcement date, and when will it come?
So in my understanding, the process is that it needs to be enforced by the government, and industry and government need to come together; all the private entities need to come together to make sure it is enforced. I'll give you a very simple example. Think about iPhone chargers: they had a separate Lightning cable, but because of the EU policymakers' mandate, it had to become a USB-C charger. These forces come from the top, and the process follows from that. But it does matter when organizations start implementing these things early: when government releases something, it can definitely be followed up.
Hello, good afternoon to all of you. I am a master's student in mathematics, and I want to do research in mathematics. As I have seen, there are advancements in AI, and math is also being integrated into machine learning. I worked on a project on cancer detection techniques where about 70% used AI, like neural networks. So in which direction is research going? Is research in mathematics still worth it?
By the way, I am a math major, so I think understanding math is very important, even though AI can now solve math problems, because to some extent it lets us understand AI. The closest we have come to understanding AI is with formal reasoning, so math is always a good background. We are doing research on a fundamental understanding of the capabilities of LLMs, and reasoning about LLMs with your formal background is a very good research direction.
Greetings to all the panelists. I apologize, I wasn't here earlier, so I couldn't hear the conversation, but as I can see from the questions, I have one about hallucinations. As a person with a legal background, I often find that AI gets the citations wrong, and the case law is often wrong. So how far can we rely on AI currently? I know the problem of hallucination will evolve and eventually be resolved, but at the current point in time, how much can we rely on AI systems? And is it possible that in the future AI will not hallucinate, that there will ever be a hallucination-free AI? I hope my question is understood.
Okay, yeah, we'll get there; I just wanted Kevin to get that slide up. This is a healthcare use case that we had the fortune of working on with Tata, and we released it a few months ago. Here we have 100% accuracy. It's not the future; it's now. We just do it differently. Non-hallucinating methods are completely possible. With that, I'm going to let Ayush and Anshu address the topic as subject experts, but it's not the future. Demand it first, because you are in a profession where, if you come up with some nonsense, the judge is going to throw you out of the room. You don't have that luxury. And for a doctor, the outcome can be much worse.
So demand it first, but the solutions are here today.
So thanks for the question, first of all. You know, to err is human; to err more is AI. So errors can always be there. Now, what are the scopes of error, and how do we reduce them? First, you should have a proper understanding of things: the system should know your context. Then the thinking process should be auditable: the sub-steps that were taken should be auditable, so that as a responsible user I can always see the reasons it arrived at that answer; maybe it made a mistake in one of its thinking steps. Then accuracy: it's very difficult for a probabilistic system to be 100% accurate, but it can still be 100% reliable. Maybe it is 95% accurate, but for the 5% of cases where it is wrong, we are able to tell 100% of the time that this answer is probably wrong, that you need to double-check or have an expert audit it. So 100% reliability is definitely achievable. We just need the right processes, thinking, and validations in place to make sure we can really trust the answer, because it is really critical to take actions on.
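The "95% accurate but 100% reliable" pattern above amounts to confidence-gated escalation: return an answer automatically only when confidence clears a threshold, and route everything else to a human. A minimal Python sketch follows; the toy model, names, and numbers are all invented for illustration.

```python
# Sketch of "reliable without being 100% accurate": any answer whose
# confidence falls below a threshold is flagged for human review instead
# of being returned as final.

def answer_with_escalation(question, model, threshold=0.9):
    label, confidence = model(question)
    if confidence >= threshold:
        return label, "automated"
    # The cases the model may get wrong are flagged every time.
    return label, "needs human review"

def toy_model(q):
    # Stand-in for a real classifier returning (answer, confidence).
    return ("approve", 0.95) if "routine" in q else ("approve", 0.60)

print(answer_with_escalation("routine renewal", toy_model))
# ('approve', 'automated')
print(answer_with_escalation("novel edge case", toy_model))
# ('approve', 'needs human review')
```

In practice the hard part is calibrating the confidence signal so that the flagged set really does contain the errors; the gate itself is this simple.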
So, Anshu, if you can address some of the fundamentals about why these hallucinations happen and why domain-specific training avoids that.
So let's think about hallucination. Prior systems were non-hallucinating systems; they were like search. By the way, humans hallucinate. If I ask two people to describe the exact same incident, sitting in different rooms, they will give different accounts. The human mind is fundamentally a hallucinating mind. In fact, LLMs became LLMs because we focused on prompt completion, and prompt completion comes from psychology: our mind has a tendency to fill in. That is how you arrive at prompt completion and go beyond search. So search is non-hallucinating, and LLMs have to hallucinate, because they have to be intelligent and smart. Again, this goes back to what Bernie was saying about biology.
If you build something like humans, it behaves like humans. And that also points to the answer: how do we increase reliability in humans? Well, you train them, and you rely not on one person but on a committee of experts, and then you hold debates and discussions. You have multiple LLMs that debate with each other; these are the standard ways. In fact, you can also show it mathematically; we have a student here in mathematics. You can show mathematically that if I have a way to reduce the probability of an error by some delta, then I can run that process in a cycle and keep reducing the probability of hallucination, reaching nearly hallucination-free output.
But again, coming back to it: you have to run a lot of LLMs. That's a lot of cost. The barrier is, again, the cost. Sorry.
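The delta-per-cycle argument can be made concrete with a back-of-the-envelope calculation. Assuming each verification pass independently catches a fraction delta of the remaining errors, the residual error rate after n passes is p(1 - delta)^n; the independence assumption and the numbers below are illustrative, not from the talk.

```python
# If each verification pass catches a fraction `delta` of remaining errors,
# the residual error rate after n passes is p * (1 - delta)**n.

p, delta = 0.05, 0.6   # 5% initial error rate; each pass catches 60% of errors

for n in range(6):
    print(n, p * (1 - delta) ** n)

# Five extra passes shrink the error rate from 5% to roughly 0.05%, but
# every pass means running the model again, which is exactly the cost
# barrier raised above.
```

The error rate falls geometrically, so near-zero is reachable, yet each factor of (1 - delta) is purchased with another full round of inference.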
Wonderful. Any other question? That side of the room. Okay. Any side of the room?
Hi, everyone. I work at an IT company, and I should be loud. Hello, everyone, am I audible now, clear and loud? That's the only tone I have; I can't be any louder. Okay. So my question is: yes, AI, like any technology, helps the majority as it grows. AI is solving problems and will solve a lot more in the future, giving us industrial solutions, speeding up the software solutions we are currently working on, and helping us in a variety of areas. My question is about students in school. We have ChatGPT, Gemini, all the AI tools out there.
It's very easy for students in school: they can do their assignments in a minute or a few seconds. So how is it helping the students? I don't know whether it's the correct question or not, but are there any steps taken by governments, or by the great leaders of our country and the world? Are there any obligations we can apply so that students don't use such tools for their assignments? Because they're free, of course, and available over the Internet; they can do their assignments in a minute.
For college students and for employees in industry, I think it's helping us. But how is it helping school students? Have there been any academic changes? I don't know whether schools are making any curriculum changes to their syllabus. Any thoughts on that?
So, did you guys get the question? Because it's a profound, important, deep question. Are we screwing up the children, is what she's asking, by allowing them to quickly come up with anything. I'd love to hear your take. Why don't you go first?
…tasks from AI; otherwise, you know, it's the same kind of journey we've had with calculators. Everyone knew how to multiply and divide large numbers until they started using calculators, and now even for simple additions you reach for the calculator. So first, it's on us personally how much we start delegating to AI and losing touch. Second, the educators, the pedagogies that form around the use of AI in education, the careers that start forming around it: they will themselves metamorphose into what AI means in the education space.
This is a question that every university is asking, and as you said, it's a profound question. I think a partial answer has already been given by Ayush. There are certain skill sets: if I want you to know addition and subtraction, you should not use calculators. But once you have a basic feel for that, the problem is not about using the calculator; the problem is what you can do with the calculator. Problem-solving never goes away, you see what I'm saying? Imagine AI makes everybody 10x better; then 10x better becomes the average, and we will now aspire for something more. Whatever is average is what AI can do, and going beyond that will require ingenuity and creativity. So I agree that the education system needs to transform, and we are also learning as we go how to transform it. But the goal will always be: can we solve problems that we could not solve otherwise? That will require us to always think outside the box. It's still an early stage, but a lot of people are thinking and talking about it, and I think it will start getting there.
That's a very profound question. I have an 8-year-old, so I worry about that every day, but I hope I'm doing the right thing by letting him play with whatever AI he wants to play with. We are ten minutes over time, I'm told, so I need to apologize to the next session. But go ahead, you had a question; we'll make that the last question.
Good evening, everyone. My question is related to AGI. We are using AI right now, so thinking of the next step, I was wondering: is there any relation between AGI and quantum computers? Will AGI only be possible after quantum computers, or with current processors?
That's a wonderful question, but it's a two-hour topic, and I don't know why you waited until the last minute to ask it. We are launching our first quantum enablement center, as STEM Practice Company, at two sites in Chattanooga, Tennessee, and we hope we can start launching quantum computers in India as well, because we are thinking through specific problems that we can solve with quantum computers today. But that is such a big topic. I think reducing computation needs and reducing cost is the way we are going, but quantum computers use only a fraction of the energy compared to a similar GPU-based machine that tries to simulate quantum processing, right?
So yes, there's a lot of energy advantage to going that route, but it's a very deep and very profound topic. Thanks for bringing it up, because we hadn't brought quantum up, but it's a broad topic. With that, we have to close, because apparently we are now stealing time from the next panel, which is a difficult thing to do. I see a hand; I'll talk to you one-on-one outside. Thank you, everybody, for coming, and I thank the panel. By the way, we have a stall, a booth, whatever we call it: Hall 6, Stall 100. Easy to remember, 6-100. Please come there, get our material, get connected, and we can keep the conversation going. Thank you very much. Thank you.
Bernie Alen
Speech speed
152 words per minute
Speech length
4146 words
Speech time
1634 seconds
GPU Over‑provisioning & Ignored Software Optimization
Explanation
Bernie says organizations chase more GPUs out of fear of falling behind, while neglecting software‑level optimizations that could reduce hardware needs. He stresses that large‑scale software development normally includes heavy infrastructure optimization, which is being skipped in AI projects.
Evidence
“There is a software optimization step that everybody is skipping, that we would normally not skip in software development.” [3]. “We are just running around getting as many GPUs as possible because we’re all afraid that the other guy would get it and then we’ll be left out, right?” [4]. “For any large -scale software development, heavy amount of infrastructure optimization goes on.” [6].
Major discussion point
AI Infrastructure Cost and Optimization
Topics
Artificial intelligence | Environmental impacts
Sustainable AI Requires Reducing Hardware Demand
Explanation
Bernie argues that cutting model complexity and compute lowers the need for expensive, high‑heat GPU clusters, which in turn reduces power and water consumption associated with large AI farms.
Evidence
“And if you reduce the complexity and if you reduce the computation, then you don’t need to run a lot of these things on expensive, high heat generating, high failure rate, limited supply.” [32].
Major discussion point
Sustainability and Environmental Impact
Topics
Environmental impacts | Artificial intelligence
Policy Makers Must Enforce Emerging AI Regulations
Explanation
Bernie points out that policymakers often misunderstand AI regulation timelines and stresses the need for enforcement of acts like DPDP and the AI Act to ensure responsible AI use.
Evidence
“I mean if I am correct the question is what is the misunderstanding that policy makers make about AI sorry” [40]. “Similarly for data point of view DPDPI is coming into India so we have been following up all the policies which is being implemented and we are also thinking ahead of the time that okay this AI act will be soon coming into picture in India as well as in other countries also what are the things we need to make sure that it’s being following up properly that’s what we follow as a process” [41].
Major discussion point
Governance, Compliance, and Deterministic AI
Topics
Artificial intelligence | The enabling environment for digital development
Educate Customers on Optimization Methods Before Large‑scale Implementation
Explanation
Bernie emphasizes that before deploying AI solutions, users should be made aware of existing mathematical and software optimization techniques that can dramatically cut compute requirements.
Evidence
“And so certainly before you guys start implementing… And whatever AI method you’re using, whatever solution you’re going after, educate yourself on these kinds of methods that exist.” [59]. “Think about it in the AI space.” [62].
Major discussion point
Practical Deployment Process and Workflow Automation
Topics
Capacity development | Artificial intelligence
Anshumali Shrivastava
Speech speed
183 words per minute
Speech length
3031 words
Speech time
991 seconds
GPU Kernels Mismatch & Quadratic Scaling Problem
Explanation
Anshumali explains that GPUs are designed for dense matrix multiplication, yet many AI workloads grow quadratically, making linear GPU scaling ineffective and prompting the need for new mathematical approaches.
Evidence
“You are throwing linear number of GPUs to tackle something that goes quadratically.” [5]. “But remember, the fundamental kernel of GPUs were always built for full matrix multiplication.” [13].
Major discussion point
Scalability Limits and the Need for Larger Context Windows
Topics
Artificial intelligence | Environmental impacts
Model Parameter Growth Outpaces GPU Memory & Compute
Explanation
He illustrates that the exponential increase in LLM parameters far exceeds the logarithmic growth of GPU memory and compute, leading to a slowdown and inaccessibility of future models.
Evidence
“…what are these on the x axis is the year and on the y axis here is the parameter count of the LLMs now you see kind of two interpolated straight line the green ones is the amount of memory available in the GPUs … the red ones are the memory or the model parameter count for the demand … the rate of growth of hardware is nowhere close to the rate of growth of demand… models will get bigger, but they will not be able to cope up with the even the GPU growth, which means models will feel slower, the better models will feel slower and unaccessible, inachievable, right?” [12].
Major discussion point
Scalability Limits and the Need for Larger Context Windows
Topics
Artificial intelligence | Environmental impacts
Need for Efficient Algorithms to Enable Affordable Large‑scale Experimentation
Explanation
Anshumali stresses that without new efficient algorithms, the hard limits of GPU‑based infrastructure will prevent affordable experimentation at scale.
Evidence
“The challenge I’m going to talk about is sustainability of AI because that’s something that’s grown increasingly relevant as of late and as Anshu here said.” [75].
Major discussion point
Scalability Limits and the Need for Larger Context Windows
Topics
Artificial intelligence | Environmental impacts
Education & AI Literacy – Shift from Calculator Mindset
Explanation
He compares AI to calculators, arguing that education should move from rote computation to problem‑solving and creativity, leveraging AI as an augmenting tool rather than a crutch.
Evidence
“But once you have gotten a basic feeling of that you the problem is not about using calculator the problem is what you can do with that calculator right so problem solving never goes away… imagine AI makes everybody 10x better then 10x better is the average and we will now aspire for something more…” [66].
Major discussion point
Education, Societal Impact, and AI Literacy
Topics
Social and economic development | Capacity development
Kenny Gross
Speech speed
130 words per minute
Speech length
1616 words
Speech time
744 seconds
MSET Achieves Massive Compute Reduction
Explanation
Kenny describes the MSET technique as delivering three orders of magnitude lower compute costs, dramatically cutting energy use and enabling real‑time AI on CPUs.
Evidence
“So the energy savings come from the three orders of magnitude lower compute costs.” [26]. “We’ve done four presentations with NVIDIA GTC conferences demonstrating with real data the reduction in compute costs.” [34].
Major discussion point
AI Infrastructure Cost and Optimization
Topics
Artificial intelligence | Environmental impacts
Energy Savings & Downtime Avoidance via CPU‑based Telemetry
Explanation
He notes that the lower compute footprint of MSET not only saves energy but also prevents costly data‑center downtime caused by hardware failures.
Evidence
“This avoids downtime.” [27]. “And then the other aspect that I wanted to mention in terms of data centers is the prognostic for avoiding downtime in servers and chips, CPUs and GPUs.” [28].
Major discussion point
Sustainability and Environmental Impact
Topics
Environmental impacts | Artificial intelligence
Sensor‑based AI Governance – Low False‑Alarm Probability
Explanation
Kenny highlights that MSET provides the mathematically lowest false‑alarm and missed‑alarm probabilities, ensuring reliable early‑warning systems for critical assets.
Evidence
“MSET has the lowest mathematically possible false alarm and missed alarm probability for” [35]. “MSET 3 is capable of detecting all mechanisms that cause CPUs and GPUs to fail.” [37].
Major discussion point
Governance, Compliance, and Deterministic AI
Topics
Artificial intelligence | Building confidence and security in the use of ICTs
Lightweight CPU Telemetry Pipeline Enables Real‑time Prognostics
Explanation
He explains that MSET streams sensor data from assets into a lightweight CPU compute box, delivering real‑time prognostics without taxing GPU resources.
Evidence
“to stream the signals from any asset, airplane engines, any asset, autonomous vehicles, into the compute box that is lightweight on CPUs, not GPUs, that gives real -time prognostics with early warning of incipient problems, not a high -low threshold.” [23]. “And one spinoff of MSET for data center applications is called electronic prognostics, where we’re able to detect all mechanisms that lead to failures of chips and system boards in the data centers.” [31].
Major discussion point
Practical Deployment Process and Workflow Automation
Topics
Artificial intelligence | Capacity development
Abhideep Rastogi
Speech speed
153 words per minute
Speech length
1187 words
Speech time
462 seconds
Staged Deployment Roadmap
Explanation
Abhideep outlines a step‑by‑step AI deployment framework starting from defining the aim, mapping opportunities, assessing data, choosing architecture, piloting, governance, and finally production.
Evidence
“So we start with the stage zero, like what’s your aim of using AI?” [57]. “Once you have the data all your alignment is done then next stage comes into as a period of what’s your architecture strategy under that it’s a big umbrella like first we have to under architecture you have to finalize what’s your deployment strategy are you looking for a GPU are you looking for a CPU and then what type of deployment are you planning is it on premises if it is a hyperscaler and then once you finalize the deployment then you comes into model are you looking for SLM are you looking for LLM or what other things need to be done once your stage is done then where are you going to host the model into once your architecture is finalized then your computer also will come into picture what’s your computer strategy you are looking for are you going to run on virtual CPUs or is it something that you can run in your local system also So, depend use case to use cases, right?” [55].
Major discussion point
Practical Deployment Process and Workflow Automation
Topics
Artificial intelligence | Capacity development
Governance & Privacy Regulations
Explanation
He stresses that AI deployments must comply with emerging data protection laws such as DPDP and the AI Act, establishing clear rules and guardrails.
Evidence
“Similarly for data point of view DPDPI is coming into India so we have been following up all the policies which is being implemented and we are also thinking ahead of the time that okay this AI act will be soon coming into picture in India as well as in other countries also what are the things we need to make sure that it’s being following up properly that’s what we follow as a process” [41]. “So there are a couple of rules we need to apply.” [52].
Major discussion point
Governance, Compliance, and Deterministic AI
Topics
Artificial intelligence | The enabling environment for digital development
Ayush Gupta
Speech speed
166 words per minute
Speech length
1364 words
Speech time
490 seconds
Smaller Language Models (SLMs) for Cheap Inference
Explanation
Ayush notes that training and deploying SLMs dramatically reduces GPU dependence and cost, enabling affordable inference for enterprise use‑cases.
Evidence
“We are also one of the companies training SLMs to power this use case.” [20]. “Like how do you have cheaper inference and that is where I’m excited about what you guys are doing at STEM.” [17].
Major discussion point
AI Infrastructure Cost and Optimization
Topics
Artificial intelligence | Financial mechanisms
Agentic Data‑Analysis Platform Replaces Static Dashboards
Explanation
He describes an agentic platform that lets business users converse with data, delivering proactive insights with lower inference cost compared to traditional static reporting.
Evidence
“we are in the space of unifying the entire data universe of an enterprise and providing an agentic data analysis platform so what that means is a normal business user who so far was used to just static dashboards can come on a system and have conversations get proactive insights and do better decision making faster” [63]. “Like how do you have cheaper inference and that is where I’m excited about what you guys are doing at STEM.” [17].
Major discussion point
Practical Deployment Process and Workflow Automation
Topics
Artificial intelligence | Capacity development
Cheaper Inference Architectures Support Greener AI
Explanation
Ayush argues that alternative, CPU‑friendly architectures lower inference costs, which translates into reduced energy consumption and a smaller environmental footprint.
Evidence
“So the exciting thing about us is can we have an alternate architecture that scales and gives us a very cheap cost of inference so that we can give the same technology at a much scalable use.” [19]. “Like how do you have cheaper inference and that is where I’m excited about what you guys are doing at STEM.” [17].
Major discussion point
Sustainability and Environmental Impact
Topics
Environmental impacts | Artificial intelligence
Enterprise AI Must Respect Data Sovereignty & Provide Deterministic Outputs
Explanation
Ayush emphasizes that AI solutions for enterprises need to honor data‑sovereignty laws (GDPR/DPDP) and deliver repeatable, audit‑ready results.
Evidence
“Similarly for data point of view DPDPI is coming into India so we have been following up all the policies which is being implemented and we are also thinking ahead of the time that okay this AI act will be soon coming into picture in India as well as in other countries also what are the things we need to make sure that it’s being following up properly that’s what we follow as a process” [41].
Major discussion point
Governance, Compliance, and Deterministic AI
Topics
Artificial intelligence | Human rights and the ethical dimensions of the information society
Kevin Zane
Speech speed
143 words per minute
Speech length
385 words
Speech time
160 seconds
High Compute Cost Limits Experimentation; Efficiency Is Essential
Explanation
Kevin points out that the steep compute and energy demands of GPU‑centric AI create a hard scalability limit, making trial‑and‑error research prohibitively expensive.
Evidence
“Well, the challenge I’m going to talk about now is the sustainability of AI because, well, as Anshu here said, we’re rapidly approaching a hard limit on how scalable GPU-based infrastructure is and with the very large impact on the environment, on water and power and the amount that is required to fuel these GPU server stacks…” [58]. “The challenge I’m going to talk about is sustainability of AI because that’s something that’s grown increasingly relevant as of late and as Anshu here said.” [75].
Major discussion point
AI Infrastructure Cost and Optimization
Topics
Artificial intelligence | Environmental impacts
Sustainable AI Depends on Algorithmic Efficiency
Explanation
He stresses that improving algorithms, not just hardware, is the key to reducing AI’s environmental footprint.
Evidence
“Well, the challenge I’m going to talk about now is the sustainability of AI because, well, as Anshu here said, we’re rapidly approaching a hard limit on how scalable GPU-based infrastructure is and with the very large impact on the environment, on water and power…” [58].
Major discussion point
Sustainability and Environmental Impact
Topics
Environmental impacts | Artificial intelligence
Deterministic AI Reduces Hallucinations and Enables Auditability
Explanation
Kevin argues that deterministic outputs make AI results repeatable, facilitating auditing and lowering the risk of hallucinated answers.
Evidence
“The challenge I’m going to talk about now is the sustainability of AI because, well, as Anshu here said…” [58]. “The challenge I’m going to talk about is sustainability of AI because that’s something that’s grown increasingly relevant as of late and as Anshu here said.” [75].
Major discussion point
Hallucinations, Reliability, and Accuracy
Topics
Artificial intelligence | Human rights and the ethical dimensions of the information society
Participant
Speech speed
138 words per minute
Speech length
925 words
Speech time
402 seconds
Plug‑and‑play Integration of New Methods with Existing LLM Services
Explanation
The participant expresses interest in how methods like MSET can be translated to CPU‑based workloads and integrated with current LLM offerings in a problem‑driven way.
Evidence
“So I came in by accident but was really interested to hear what’s being discussed, especially MSET and the power of reducing the kind of compute and also translating into the CPU.” [29].
Major discussion point
Scalability Limits and the Need for Larger Context Windows
Topics
Artificial intelligence | Capacity development
AI Tools Challenge Academic Integrity
Explanation
The participant raises concerns about students using AI tools like ChatGPT, likening it to the calculator era and questioning policy responses.
Evidence
“My question is, for the students who are in a school, right, we do have ChatGPT, HMNI, all the AI tools there.” [65]. “How the students’ mindset, you know, we can, are there any, you know, obligations if we are applying to our students to not use such a tool?” [73].
Major discussion point
Education, Societal Impact, and AI Literacy
Topics
Social and economic development | Capacity development
Agreements
Agreement points
Current AI infrastructure is unsustainable and creates significant cost and environmental problems
Speakers
– Bernie Alen
– Anshumali Shrivastava
– Kenny Gross
– Kevin Zane
Arguments
GPU-based infrastructure creates expensive, high heat generating, high failure rate systems with limited supply
The rate of growth of hardware capabilities is nowhere close to the rate of growth of AI model parameter demands
MSET technology achieves three orders of magnitude lower compute costs compared to conventional approaches
Current AI infrastructure expansion is causing significant environmental harm through excessive power and water consumption
Summary
All speakers agree that the current GPU-centric approach to AI is creating unsustainable costs, environmental damage, and performance limitations that cannot be solved by simply adding more hardware
Topics
Artificial intelligence | Environmental impacts | Financial mechanisms
Alternative mathematical and algorithmic approaches can dramatically reduce computational requirements
Speakers
– Bernie Alen
– Anshumali Shrivastava
– Kenny Gross
Arguments
Mathematical optimization methods that have existed for decades can reduce computation needs and enable AI to run on CPUs, edge computers, and mobile devices
New mathematical approaches to attention can enable longer context windows more efficiently than current methods
MSET learns correlation patterns between multiple signals and detects anomalies days or weeks before threshold-based systems
Summary
There is strong consensus that established mathematical techniques and new algorithmic innovations can provide superior performance while dramatically reducing computational costs
Topics
Artificial intelligence | The enabling environment for digital development
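The MSET discussion above rests on an idea that can be sketched without the proprietary details: learn the normal correlation structure among sensor signals, then alarm on deviations from the correlation-based estimate rather than on fixed high/low limits. The toy below is an illustrative approximation, not Oracle's MSET: the data is simulated, the signal names are invented, and ordinary least squares stands in for the patented nonlinear multivariate estimator. It shows why a slow drift can be caught long before any absolute threshold trips.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two correlated "sensor" signals under normal operation:
# temperature tracks load plus small noise.
n_train = 500
load = rng.normal(50, 5, n_train)
temp = 0.8 * load + 10 + rng.normal(0, 0.5, n_train)

# Learn the normal correlation: predict temp from load (least squares
# here; MSET itself uses a nonlinear multivariate estimator).
A = np.column_stack([load, np.ones(n_train)])
coef, *_ = np.linalg.lstsq(A, temp, rcond=None)
resid_sigma = np.std(temp - A @ coef)

def residual(load_now, temp_now):
    """Deviation of observed temp from its correlation-based estimate."""
    return temp_now - (coef[0] * load_now + coef[1])

# Failure scenario: cooling slowly degrades, so temp drifts up by
# 4 degrees over 200 steps while load stays normal.
drift = np.linspace(0, 4, 200)
load_t = rng.normal(50, 5, 200)
temp_t = 0.8 * load_t + 10 + rng.normal(0, 0.5, 200) + drift

z = np.abs(residual(load_t, temp_t)) / resid_sigma
corr_alarm = int(np.argmax(z > 4))   # first correlation-based alarm

HIGH_LIMIT = 70.0                    # a conservative fixed alarm limit
thresh_hits = np.nonzero(temp_t > HIGH_LIMIT)[0]
print("correlation-based alarm at step:", corr_alarm)
print("fixed high limit trips at step:",
      int(thresh_hits[0]) if thresh_hits.size else "never in this window")
```

The correlation-based residual exposes the drift while the absolute temperature is still well inside its fixed limits, which is the "days or weeks earlier" effect described in the argument above.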
Enterprise AI requires context-specific solutions that general models cannot provide
Speakers
– Ayush Gupta
– Abhideep Rastogi
Arguments
Enterprise AI requires understanding of specific business context, processes, and KPIs that general models like ChatGPT cannot provide
Multi-stage transformation process from identifying AI objectives to production deployment with proper governance frameworks
Summary
Both speakers agree that successful enterprise AI implementation requires deep understanding of specific business contexts and systematic transformation processes, not just general-purpose AI tools
Topics
Artificial intelligence | The digital economy | Social and economic development
Reliability and deterministic behavior are essential for production AI systems
Speakers
– Kenny Gross
– Kevin Zane
– Ayush Gupta
Arguments
Traditional high-low threshold monitoring is reactive and prone to false alarms
Deterministic AI approaches are necessary for production environments where hallucinations and false data cannot be tolerated
100% reliability is achievable through proper processes, validation, and auditable thinking steps, even if accuracy is not perfect
Summary
All three speakers emphasize that production AI systems must be reliable, predictable, and auditable, particularly in critical applications where errors have serious consequences
Topics
Artificial intelligence | Building confidence and security in the use of ICTs
India’s scale and cost constraints create opportunities for developing globally applicable solutions
Speakers
– Bernie Alen
– Ayush Gupta
Arguments
India presents the toughest scalability challenges with massive population, extensive mobile device penetration, and cost constraints
Indian market requires solutions that cost approximately 1 rupee per conversation compared to $1 in the US market
Summary
Both speakers see India’s demanding requirements as a forcing function that will drive innovation in efficient, scalable AI solutions that can then be applied globally
Topics
Closing all digital divides | The digital economy | Social and economic development
Similar viewpoints
Both emphasize that mathematical and algorithmic innovations are urgently needed to make AI development sustainable and accessible for experimentation
Speakers
– Bernie Alen
– Anshumali Shrivastava
Arguments
Mathematical optimization can provide sustainable solutions before environmental damage becomes irreversible
Trial and error experimentation is limited by the high cost of current AI infrastructure
Topics
Artificial intelligence | Environmental impacts | Capacity development
Both stress the importance of data governance, compliance, and systematic approaches to enterprise AI implementation
Speakers
– Ayush Gupta
– Abhideep Rastogi
Arguments
Data sovereignty and compliance requirements prevent enterprises from connecting sensitive data to public AI services
Multi-stage transformation process from identifying AI objectives to production deployment with proper governance frameworks
Topics
Data governance | Artificial intelligence | Human rights and the ethical dimensions of the information society
Both advocate for intelligent monitoring and algorithmic solutions that can improve system reliability while reducing resource consumption
Speakers
– Kenny Gross
– Kevin Zane
Arguments
Electronic prognostics can detect all mechanisms leading to CPU and GPU failures in data centers, avoiding costly downtime
Better algorithms can increase AI efficiency and speed without requiring massive power consumption
Topics
Artificial intelligence | Building confidence and security in the use of ICTs | Environmental impacts
Unexpected consensus
Non-hallucinating AI is achievable today, not a future aspiration
Speakers
– Bernie Alen
– Ayush Gupta
Arguments
Non-hallucinating AI methods are possible today using different approaches than probabilistic language models
100% reliability is achievable through proper processes, validation, and auditable thinking steps, even if accuracy is not perfect
Explanation
This consensus is unexpected because it challenges the common assumption that hallucinations are an inevitable characteristic of current AI systems. Both speakers assert that reliable, non-hallucinating AI is currently achievable through different methodological approaches
Topics
Artificial intelligence | Building confidence and security in the use of ICTs
Academic mathematics research remains crucial for AI advancement
Speakers
– Anshumali Shrivastava
– Participant
Arguments
New mathematical approaches to attention can enable longer context windows more efficiently than current methods
Mathematics research remains relevant and important for understanding AI capabilities and formal reasoning
Explanation
This consensus is unexpected given concerns that AI might make mathematical research obsolete. Instead, both emphasize that mathematical understanding is more important than ever for advancing AI capabilities
Topics
Artificial intelligence | Capacity development
CPU-based solutions can outperform GPU-based systems at scale
Speakers
– Anshumali Shrivastava
– Bernie Alen
Arguments
Beyond a 131,000-token context window, CPU-based solutions with new algorithms can outperform GPU-based systems
Real-world demonstration showed 100% accuracy without using any GPUs in proposed infrastructure
Explanation
This consensus challenges the dominant narrative that GPUs are essential for high-performance AI. Both speakers provide evidence that algorithmic improvements can make CPU-based solutions superior for certain applications
Topics
Artificial intelligence | The enabling environment for digital development
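The crossover claim above can be made concrete with a back-of-envelope cost model. Standard attention over n tokens does work proportional to n², while a sub-quadratic method of the kind discussed here might do roughly c·n·log₂(n) work with a much larger constant c. The constant below is invented purely so the crossover lands near 131,072 tokens; it is not a measurement of any real system.

```python
import math

# Hypothetical per-token overhead of the sub-quadratic alternative,
# chosen so the crossover sits near a 131,072-token context.
C_SUBQUAD = 7_700.0

def quadratic_cost(n):
    """Work units for standard attention over n tokens."""
    return n * n

def subquadratic_cost(n):
    """Work units for a (hypothetical) c * n * log2(n) alternative."""
    return C_SUBQUAD * n * math.log2(n)

for n in [8_192, 131_072, 1_000_000]:
    ratio = quadratic_cost(n) / subquadratic_cost(n)
    print(f"n={n:>9,}: quadratic/sub-quadratic cost ratio = {ratio:.2f}")
```

Below the crossover the quadratic method is cheaper despite its worse asymptotics; above it, the ratio grows without bound, which is why the argument is framed as "beyond" a specific context length rather than as a blanket CPU-beats-GPU claim.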
Overall assessment
Summary
There is remarkably strong consensus among speakers that current AI development approaches are fundamentally unsustainable and that alternative mathematical and algorithmic solutions can provide superior performance at dramatically lower costs. All speakers agree on the need for more reliable, deterministic AI systems for production use, and there is broad agreement that enterprise AI requires context-specific solutions rather than general-purpose models.
Consensus level
High level of consensus with significant implications for AI development strategy. The agreement spans technical, economic, and environmental dimensions, suggesting a paradigm shift away from hardware-intensive approaches toward algorithm-intensive solutions. This consensus could drive major changes in AI infrastructure investment and development priorities, particularly in emerging markets where cost constraints are most severe.
Differences
Different viewpoints
Fundamental approach to AI computation – GPU-based vs CPU-based solutions
Speakers
– Bernie Alen
– Anshumali Shrivastava
Arguments
GPU-based infrastructure creates expensive, high heat generating, high failure rate systems with limited supply
Beyond a 131,000-token context window, CPU-based solutions with new algorithms can outperform GPU-based systems
Summary
While both speakers critique current GPU-centric approaches, they differ on solutions. Bernie advocates for moving away from GPUs entirely to CPUs/edge devices through software optimization, while Anshumali proposes a hybrid approach where CPUs can outperform GPUs only at very large context windows (beyond 131K tokens) using new mathematical methods.
Topics
Artificial intelligence | The enabling environment for digital development | Environmental impacts
Nature of AI hallucinations and their solutions
Speakers
– Bernie Alen
– Anshumali Shrivastava
Arguments
Non-hallucinating AI methods are possible today using different approaches than probabilistic language models
LLMs have to hallucinate in order to be intelligent and smart
Summary
Bernie claims that hallucination-free AI is achievable now through deterministic approaches, while Anshumali argues that hallucinations are inherent to intelligent systems (like human minds) and that LLMs must hallucinate to be truly intelligent. Anshumali sees hallucinations as a feature, not a bug, that enables intelligence beyond simple search.
Topics
Artificial intelligence | Building confidence and security in the use of ICTs
Scope and applicability of alternative AI methods
Speakers
– Bernie Alen
– Kenny Gross
Arguments
Mathematical optimization methods that have existed for decades can reduce computation needs and enable AI to run on CPUs, edge computers, and mobile devices
MSET learns correlation patterns between multiple signals and detects anomalies days or weeks before threshold-based systems
Summary
While both advocate for alternative approaches, Bernie presents a broad solution applicable to general AI applications, while Kenny focuses specifically on sensor-based anomaly detection and prognostics. Kenny’s MSET is domain-specific for multivariate signal analysis, whereas Bernie claims broader applicability across AI applications.
Topics
Artificial intelligence | The enabling environment for digital development
Unexpected differences
Role of hallucinations in AI intelligence
Speakers
– Bernie Alen
– Anshumali Shrivastava
Arguments
Non-hallucinating AI methods are possible today using different approaches than probabilistic language models
LLMs have to hallucinate in order to be intelligent and smart
Explanation
This disagreement is unexpected because both speakers are advocating for better AI systems, yet they have fundamentally opposite views on whether hallucinations are a bug to be eliminated or a necessary feature of intelligence. Anshumali’s position that hallucinations are essential for intelligence challenges the common assumption that reliability requires eliminating hallucinations.
Topics
Artificial intelligence | Building confidence and security in the use of ICTs
Immediacy of solutions vs future development
Speakers
– Bernie Alen
– Anshumali Shrivastava
Arguments
Real-world demonstration showed 100% accuracy without using any GPUs in proposed infrastructure
New mathematical approaches to attention can enable longer context windows more efficiently than current methods
Explanation
Unexpectedly, Bernie claims current availability of complete solutions while Anshumali, despite being more academically oriented, focuses on incremental improvements to existing systems. This reverses the typical expectation that industry would be more incremental while academia would claim breakthrough solutions.
Topics
Artificial intelligence | The enabling environment for digital development
Overall assessment
Summary
The main disagreements center on fundamental approaches to AI computation (complete departure from GPUs vs selective algorithmic improvements), the nature of AI reliability (elimination vs management of hallucinations), and the scope of alternative solutions (broad applicability vs domain-specific applications). Despite shared concerns about sustainability and costs, speakers differ significantly on implementation strategies.
Disagreement level
Moderate to high disagreement on technical approaches and philosophical foundations, but strong consensus on problem identification. The disagreements have significant implications as they represent different paradigms for AI development – revolutionary vs evolutionary approaches, deterministic vs probabilistic systems, and immediate vs future solutions. These differences could lead to divergent research and development paths in the AI field.
Partial agreements
Partial agreements
All speakers agree that current AI infrastructure is environmentally unsustainable and that hardware growth cannot keep pace with AI demands. However, they disagree on solutions: Bernie advocates for complete shift to optimized software on CPUs, Anshumali proposes new mathematical approaches for specific use cases, and Kevin supports algorithmic improvements while maintaining some infrastructure.
Speakers
– Bernie Alen
– Anshumali Shrivastava
– Kevin Zane
Arguments
The rapid pace of AI development is creating unsustainable infrastructure demands that harm the planet
The rate of growth of hardware capabilities is nowhere close to the rate of growth of AI model parameter demands
Current AI infrastructure expansion is causing significant environmental harm through excessive power and water consumption
Topics
Environmental impacts | Artificial intelligence
Both agree that enterprise AI deployment requires systematic approaches and cannot rely on general-purpose models like ChatGPT. However, Ayush focuses on the technical limitations and data sovereignty issues, while Abhideep emphasizes the organizational transformation process and governance frameworks.
Speakers
– Ayush Gupta
– Abhideep Rastogi
Arguments
Enterprise AI requires understanding of specific business context, processes, and KPIs that general models like ChatGPT cannot provide
Multi-stage transformation process from identifying AI objectives to production deployment with proper governance frameworks
Topics
Artificial intelligence | The digital economy | Data governance
Both recognize India’s unique cost constraints and scale challenges as drivers for innovation. However, Ayush focuses on market pricing pressures and business viability, while Bernie sees India as a testing ground for developing globally applicable solutions.
Speakers
– Ayush Gupta
– Bernie Alen
Arguments
Indian market requires solutions that cost approximately 1 rupee per conversation compared to $1 in the US market
India presents the toughest scalability challenges with massive population, extensive mobile device penetration, and cost constraints
Topics
Closing all digital divides | The digital economy | Financial mechanisms
Takeaways
Key takeaways
Current AI development is skipping critical software optimization steps that could dramatically reduce infrastructure costs and environmental impact
Mathematical optimization methods that have existed for decades can enable AI to run efficiently on CPUs, edge devices, and mobile phones instead of expensive GPU clusters
Real-world demonstrations show 100% accuracy is achievable without GPUs, with cost reductions of up to 2,500 times compared to traditional methods
The AI industry is approaching hard limits due to the ‘AI memory wall’ – hardware growth cannot keep pace with model parameter demands
Context window limitations are plateauing and represent the next major challenge for complex AI automation and reasoning
New mathematical approaches to attention mechanisms can outperform GPU-based systems for large context windows using CPU-based solutions
Enterprise AI requires domain-specific solutions with proper governance, data sovereignty, and deterministic behavior rather than general-purpose models like ChatGPT
India’s market constraints (massive scale, cost requirements of ~1 rupee per conversation) will drive the development of globally applicable efficient AI solutions
Sustainability concerns are critical – current AI infrastructure expansion is causing significant environmental harm through excessive power and water consumption
Non-hallucinating AI methods are available today using different approaches than probabilistic language models, achieving 100% reliability through proper validation processes
Resolutions and action items
STEM Practice Company to continue demonstrating optimization methods through Oracle partnership and real-world use cases
Panelists committed to educating the market about existing mathematical optimization methods before environmental damage becomes irreversible
Open invitation extended for companies to participate in ‘blind bake-offs’ using their own data to compare MSET against current techniques
STEM Practice Company planning to launch quantum enablement centers in Chattanooga, Tennessee and potentially India
Continued collaboration between academia (Rice University), industry (Tata Group, Genloop), and technology partners (Oracle) to advance efficient AI methods
Audience encouraged to visit booth (Hall 6, Stall 100) for materials and continued conversation
Unresolved issues
How to effectively transform educational systems to adapt to AI tools while maintaining critical thinking and problem-solving skills
Specific timeline and enforcement mechanisms for AI governance frameworks like DPDP Act in India
The relationship between AGI development and quantum computing requirements – identified as a complex 2-hour topic requiring separate discussion
Detailed technical implementation of integrating MSET and other optimization methods with existing LLM and RAG ecosystems
Balancing the trade-offs between AI capability advancement and environmental sustainability on a global scale
Establishing industry-wide standards for deterministic AI in critical applications like healthcare and legal services
Suggested compromises
Use block sparsity and mixture of experts as intermediate solutions while developing more fundamental algorithmic improvements
Implement hybrid approaches that combine GPU efficiency for smaller context windows with CPU-based solutions for larger contexts
Develop AI systems with 95% accuracy but 100% reliability through proper validation and auditing processes
Allow controlled AI experimentation in education while maintaining core skill development in fundamental areas
Balance between using existing AI tools for productivity while investing in more efficient underlying technologies
Adopt staged enterprise AI transformation processes that include proper governance frameworks from the beginning rather than retrofitting them later
Thought provoking comments
We are not asking the questions that we would normally ask in any project of this scale… There is a software optimization step that everybody is skipping, that we would normally not skip in software development.
Speaker
Bernie Alen
Reason
This comment reframes the entire AI infrastructure discussion by suggesting the industry has abandoned fundamental engineering principles due to AI hype. It challenges the assumption that expensive GPU infrastructure is necessary and introduces the core thesis that mathematical optimization methods have been overlooked.
Impact
This opening statement set the entire tone and direction of the panel, establishing the central argument that the AI industry is making costly mistakes by not applying traditional optimization methods. It provided the foundation for all subsequent technical discussions about alternatives to GPU-heavy infrastructure.
The rate of growth of hardware is nowhere close to the rate of growth of demand… models will get bigger, but they will not be able to cope up with even the GPU growth, which means models will feel slower, the better models will feel slower and unaccessible.
Speaker
Anshumali Shrivastava
Reason
This insight uses concrete data to demonstrate that the current scaling approach is fundamentally unsustainable. By showing the diverging exponential curves of hardware capability versus model demands, it provides mathematical proof that the industry is heading toward a crisis.
Impact
This comment shifted the discussion from theoretical optimization benefits to urgent necessity. It established that the panel’s solutions aren’t just cost-saving measures but essential for the future viability of AI development, adding urgency to the conversation.
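Shrivastava's point is about two exponentials with different bases: even if hardware capacity and model demand start at the same point, any gap in growth rate compounds year over year. The annual rates below are illustrative assumptions, not measured figures; the structure of the divergence is the point.

```python
# Toy illustration of the "AI memory wall": two exponentials with
# different growth rates diverge no matter how close they start.
# Both rates are assumed for illustration, not measured figures.
HW_GROWTH = 1.5      # assumed ~1.5x/year gain in memory/compute capacity
MODEL_GROWTH = 4.0   # assumed ~4x/year growth in model parameter demand

hw, model = 1.0, 1.0   # normalized to the same starting point
for year in range(1, 6):
    hw *= HW_GROWTH
    model *= MODEL_GROWTH
    print(f"year {year}: demand/capacity ratio = {model / hw:.1f}x")
```

The ratio itself grows geometrically (here by a factor of 4/1.5 ≈ 2.67 per year), so the shortfall cannot be closed by incremental hardware improvement, only by changing the algorithmic demand curve.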
I believe right now the biggest complaint in enterprises are agents do not have common sense. They hallucinate… To go from 50, 60 to 99, you need that constant… and that will happen when we will have really long context.
Speaker
Anshumali Shrivastava
Reason
This comment connects technical limitations (context windows) to real-world business problems (unreliable AI agents). It provides a clear pathway from current AI limitations to practical enterprise value, making the technical discussion relevant to business stakeholders.
Impact
This insight bridged the gap between technical optimization and business value, helping other panelists connect their solutions to enterprise needs. It influenced subsequent discussions about governance, reliability, and practical AI deployment challenges.
In India the cost has to come down even further. Like it has to be probably 1 rupee per conversation to actually unlock the same quality of insights… the major cost driver is the GPU.
Speaker
Ayush Gupta
Reason
This comment introduces the critical constraint of economic accessibility in emerging markets, showing that current AI costs create a digital divide. It demonstrates how cost optimization isn’t just about efficiency but about global AI accessibility and democratization.
Impact
This perspective added a global equity dimension to the technical discussion, reinforcing Bernie’s earlier point about India being the testing ground for scalable solutions. It helped frame the panel’s solutions as not just cost-saving but democratizing technologies.
Everything at the end of the day boils down to efficiency… all interesting problems will be solved if you are allowed enough trial and error… the biggest hurdle in the advancement is the ease at which I can trial and error and experiment.
Speaker
Anshumali Shrivastava
Reason
This insight reframes AI development challenges as fundamentally about experimentation costs rather than technical limitations. It suggests that high infrastructure costs are actually limiting AI innovation by making experimentation prohibitively expensive.
Impact
This comment provided a unifying theory for why cost optimization matters beyond just saving money – it’s about enabling innovation itself. It helped tie together various technical solutions discussed by showing how they all serve the broader goal of making AI experimentation accessible.
We are taking that water and that power and the planet from you… We are taking… By using this expensive infrastructure… We need to be very careful about what we are doing to the planet.
Speaker
Bernie Alen
Reason
This comment introduces environmental sustainability as a moral imperative, not just a technical consideration. It personalizes the environmental impact by directly addressing the younger generation, adding ethical weight to the technical optimization discussion.
Impact
This shifted the conversation from purely technical and economic considerations to include environmental responsibility. It elevated the panel’s solutions from business optimizations to moral imperatives, particularly resonating with the student participants.
So, to err is human. To err more is AI… 100% reliability is definitely achievable… maybe it is 95% accurate but the 5% times it is wrong we are able to tell 100% of the times that this is probably wrong.
Speaker
Ayush Gupta
Reason
This comment provides a nuanced solution to the hallucination problem by distinguishing between accuracy and reliability. It offers a practical framework for making AI systems trustworthy even when they’re not perfect, which is crucial for enterprise adoption.
Impact
This insight helped resolve the tension between AI’s inherent probabilistic nature and enterprise needs for reliability. It influenced the discussion toward practical governance solutions and helped address concerns from participants in high-stakes professions like law and medicine.
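The accuracy-versus-reliability distinction in this comment maps onto what the machine-learning literature calls selective prediction: an answer is auto-returned only when some validator score clears a floor, and everything else is escalated rather than guessed. The sketch below uses a hypothetical validator score and invented field names; real systems might derive the score from self-consistency checks, retrieval grounding, or rule-based audits.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    confidence: float  # validator score in [0, 1]; source is system-specific

CONFIDENCE_FLOOR = 0.9

def respond(answer: Answer) -> dict:
    """Return an answer only when the validator clears the floor;
    otherwise refuse and escalate instead of risking a hallucination."""
    if answer.confidence >= CONFIDENCE_FLOOR:
        return {"status": "answered", "text": answer.text}
    return {"status": "escalated", "text": None}

print(respond(Answer("Invoice total is 1,240 INR", 0.97)))
# → {'status': 'answered', 'text': 'Invoice total is 1,240 INR'}
print(respond(Answer("Contract clause 4.2 permits refunds", 0.55)))
# → {'status': 'escalated', 'text': None}
```

Under this policy the system may answer only 95% of queries, but every answer it does return has passed validation, and every doubtful case is visibly flagged, which is the "100% reliability without 100% accuracy" framing in the quote.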
Overall assessment
These key comments collectively transformed what could have been a narrow technical discussion about GPU alternatives into a comprehensive examination of AI’s sustainability crisis. The most impactful insights connected technical optimization to broader themes of global accessibility, environmental responsibility, and innovation democratization. Bernie’s opening challenge to industry assumptions set the stage, while Anshumali’s data-driven arguments about unsustainable scaling trends created urgency. The discussion evolved from ‘how to save money’ to ‘how to save AI development itself’ through the recognition that current approaches limit experimentation, create global inequities, and harm the environment. The panelists’ ability to connect technical solutions to these broader implications made the conversation relevant to diverse stakeholders – from students concerned about environmental impact to enterprise users needing reliable AI systems to researchers needing affordable experimentation platforms.
Follow-up questions
How can we create 100 million context window capabilities faster than current approaches?
Speaker
Anshumali Shrivastava
Explanation
This is critical for enabling complex automation and agentic workflows that require extensive reasoning across large amounts of information, which current quadratic scaling limitations prevent.
How can we achieve inference costs as low as 1 rupee per conversation for Indian market scalability?
Speaker
Ayush Gupta
Explanation
This cost reduction is essential for making AI solutions viable at massive scale in price-sensitive markets like India, where current GPU-based inference costs are prohibitive.
What are the specific integration pathways for MSET with current LLM and MCP-based ecosystems?
Speaker
Participant
Explanation
Understanding how foundational optimization techniques like MSET can be integrated with existing AI infrastructure is crucial for practical implementation.
How should educational curricula be transformed to address AI’s impact on student learning and skill development?
Speaker
Participant
Explanation
This addresses the fundamental concern about whether AI tools are helping or hindering student development and what pedagogical changes are needed.
What is the relationship between AGI development and quantum computing capabilities?
Speaker
Participant
Explanation
This explores whether quantum computing is necessary for achieving AGI or if current processors can support AGI development.
How can we develop open ecosystem platforms for plugging optimization techniques into current GenAI services?
Speaker
Participant
Explanation
This addresses the practical need for interoperability and plug-and-play solutions that can integrate advanced optimization methods with existing AI infrastructure.
What specific governance frameworks are needed for enterprise AI deployment beyond current compliance requirements?
Speaker
Multiple participants (Abhideep Rastogi, Ayush Gupta)
Explanation
Current regulations like GDPR and DPDP are insufficient for addressing the complex governance challenges posed by AI systems in enterprise environments.
How can we achieve 100% reliability in AI systems even when accuracy is less than perfect?
Speaker
Ayush Gupta
Explanation
This addresses the critical need for AI systems that can reliably identify when they might be wrong, especially in high-stakes applications like legal and medical fields.
What are the mathematical foundations for reducing hallucination probability to near-zero through iterative processes?
Speaker
Anshumali Shrivastava
Explanation
Understanding the mathematical principles behind reducing AI hallucinations is crucial for developing more reliable AI systems for critical applications.
How can quantum computing be practically implemented for specific AI problems today rather than waiting for future breakthroughs?
Speaker
Bernie Alen
Explanation
This explores immediate practical applications of quantum computing for AI rather than theoretical future possibilities, focusing on energy efficiency and computational advantages.
Disclaimer: This is not an official session record. DiploAI generates these resources from audiovisual recordings, and they are presented as-is, including potential errors. Due to logistical challenges, such as discrepancies in audio/video or transcripts, names may be misspelled. We strive for accuracy to the best of our ability.