
Q&A

Tamper-Proofing Data to Accelerate AI Adoption

Matt Coolidge

SVP of Global Communications

How Businesses Can Drive Growth and Improve Governance: Q&A with Henry Guo, VP of Product Management at Casper Labs

Data is growing at an unprecedented rate. By 2025, its global volume is projected to reach 175 zettabytes. To put that into perspective, it's the equivalent of 175 billion one-terabyte USB drives' worth of information. It's also a potential treasure trove for businesses, and many are hoping artificial intelligence (AI) will accelerate the rate of return on available data. But with the introduction of the EU's AI Act, as well as intensifying scrutiny around how the private sector uses sensitive data, companies today are balancing the promise of data growth with its pitfalls. In short, they must ensure security and compliance in concert with innovation.

We recently sat down with Henry Guo, Casper Labs' new VP of AI Product Management, to talk about why he believes blockchain is the linchpin for improving data quality and enhancing safety and transparency across AI deployments, while also unlocking new opportunities for AI innovation.

Q: You’ve worked on a number of AI projects over the years. In your opinion, what’s holding companies back from adopting the technology today?

A lot of my work has been about building customer confidence in AI. Walking customers through how an AI model works, how it learns, and where it sources its data is a huge part of that. This was a main focus while I was at oPRO.ai, where we even created a front-end user interface that enabled engineers of any background, regardless of AI expertise, to easily understand the product's algorithm and interact with it.

I can't emphasize enough how important trust is when it comes to AI. Generative AI in particular represents a paradigm shift, and it's a rapidly evolving technology that people aren't entirely comfortable using just yet. The amount of data being processed by these models is extraordinary. OpenAI released ChatGPT in 2022, which helped kick off the generative AI boom. Now, large language models (LLMs) easily exceed 100 billion parameters, and they're trained on massive datasets that keep growing. Everyone is excited about these developments, but there's going to be a learning curve. How are we going to make sure all of that data is secure and "good" data? That uncertainty is what's holding companies back from going all in on AI. AI can be a force for good if we can make it accessible, safe, and trustworthy for both AI developers and end users of all stripes.

Q: Your background is in data security, which dovetails nicely with these questions companies have about the promise of AI today. Tell us a bit more about your work before joining Casper Labs.

I would consider myself a computer scientist by education and a technologist by training who loves building products around cutting-edge tech. I worked in the consulting space for a few years after graduating from Carnegie Mellon, and then joined Cisco Systems right after my MBA from Northwestern University. That’s where I used my combined expertise in computer science and corporate strategy to help companies understand and mitigate their cybersecurity risk vectors. 

When I went in-house at Cisco in 2013, I worked closely with their security software business unit. Then, at VMware, my responsibilities revolved around cloud management. Our software managed the optimization and orchestration of workloads and data across multiple clouds, both on-premises and in the public cloud. It was an exciting time, as businesses were just starting to turn their attention to the quality of their cloud management systems and to how data could help them make better technology decisions in an automated fashion.

Q: When did you start focusing specifically on AI?

Around that time, new questions were popping up about the kinds of processes companies needed in order to decipher their data, and about how to convert information into insights that fuel smarter business decisions. I realized that better data governance and automation were imperative for businesses to stay competitive. Everyone was trying to make sense of their growing data and figure out how to build IT infrastructures that were safer, more efficient, and more compliant.

This growing need for better data-driven decision making propelled me on my journey into AI. I transitioned from large Fortune 500 technology companies like Cisco Systems and VMware to AI startups in the fall of 2018. There, I worked with a motley crew of brilliant professors, AI scientists and practitioners, PhD researchers, and industry specialists from established technology companies like Google, VMware, and Microsoft. Together, we built AI software and infrastructure platforms that transformed unstructured data into highly useful AI models for a wide range of applications, from next-generation autonomous industrial manufacturing, to medical image detection, to natural language summarization. From these experiences, I knew we needed to find a better, more intelligent way to work with data to create responsible AI models.

Q: What was your first encounter with blockchain technology?

Back in 2011, I read the original Bitcoin whitepaper and protocol documentation. I had a solid understanding of what it was, since my master's degree in computer science focused on cryptography and cybersecurity. I wasn't particularly interested in trading cryptocurrencies, but the underlying distributed ledger technology, blockchain, caught my attention. I was drawn to the idea of a fundamental technology that could automate and tamper-proof vast swaths of data on a decentralized network.

Q: True or false: Blockchain can be an augmentative technology for AI.

True.

AI models have to deal with massive volumes of structured and unstructured data in various formats (text, image, video, and audio, among others), and once a model is trained, there's the added challenge of ensuring that neither the model nor its data streams have been tampered with or corrupted.

Throughout my time working with experts in the AI space, "garbage in, garbage out" was a recurring adage. Cleaning and preparing data plays a significant role in how well an AI model performs. If AI practitioners use distorted data, fail to incorporate the right guardrails, or don't regularly tune their models, it can easily lead to bad results.

That’s where blockchain comes in. Automating AI guardrails, data governance, and security-related measures in the AI infrastructure is the most feasible way to ensure tamper-proofing. Blockchain is suited for this task because an immutable ledger can record any and all changes pertaining to an AI model’s metadata and training datasets. 
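To make that concrete, here is a minimal sketch in Python of the kind of hash-chained, append-only record such a ledger maintains. The names (`AuditLedger`, `record`) and fields are illustrative assumptions, not Casper Labs' actual API: each entry commits to the previous one, so silently altering any historical record of a model's metadata or training data breaks every hash that follows.

```python
import hashlib
import json
import time

def fingerprint(payload: bytes) -> str:
    """Content hash used to detect any later tampering."""
    return hashlib.sha256(payload).hexdigest()

class AuditLedger:
    """Toy append-only ledger: each entry commits to the one before it,
    so altering any historical record invalidates every later hash."""

    def __init__(self):
        self.entries = []

    def record(self, actor: str, model_id: str, dataset_hash: str, note: str):
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        body = {
            "actor": actor,                # who made the change
            "model_id": model_id,          # which model was touched
            "dataset_hash": dataset_hash,  # fingerprint of the training data
            "note": note,
            "timestamp": time.time(),
            "prev_hash": prev_hash,        # chain link to the prior entry
        }
        body["entry_hash"] = fingerprint(json.dumps(body, sort_keys=True).encode())
        self.entries.append(body)
        return body

ledger = AuditLedger()
data = b"cleaned training corpus v1"
ledger.record("alice", "support-bot-v1", fingerprint(data), "initial training run")
```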

Let's say a disgruntled employee retrains their company's AI chatbot on factually inaccurate information. The fallout could be anything from degraded customer service, to offensive language, to disclosure of sensitive user information. Without governance and auditability, there's no effective way to establish when and how the AI model's behavior shifted. Companies find themselves looking for a needle in a haystack when trying to identify exactly which datasets were altered, and who was responsible. Blockchain's serialized, append-only record can surface these insights by essentially going back in time to revisit previous iterations of the AI model.
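Continuing the hypothetical sketch above, an auditor could walk that same chain to pinpoint exactly when a model's training data changed, and under whose credentials:

```python
def find_dataset_changes(ledger: AuditLedger, model_id: str):
    """Walk the ledger in order and report every entry where the
    dataset fingerprint differs from the previous one for this model."""
    changes, last_hash = [], None
    for entry in ledger.entries:
        if entry["model_id"] != model_id:
            continue
        if entry["dataset_hash"] != last_hash:
            changes.append((entry["timestamp"], entry["actor"], entry["dataset_hash"]))
            last_hash = entry["dataset_hash"]
    return changes

# A later, unauthorized retraining shows up as a new fingerprint:
ledger.record("mallory", "support-bot-v1",
              fingerprint(b"poisoned corpus"), "retraining")
for ts, actor, h in find_dataset_changes(ledger, "support-bot-v1"):
    print(ts, actor, h[:12])
```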

Q: Using blockchain technology, we’re building a new AI governance tool with IBM. What are your thoughts on the upcoming launch?

Working with IBM on this AI governance SaaS offering is a huge turning point for Casper Labs. IBM was founded in 1911 and has been at the forefront of enterprise innovation ever since. It's a widely respected, established firm with deep security credentials. For IBM to collaborate with us on this major initiative, they must clearly see the value of what Casper Labs is doing at the intersection of blockchain and AI.

The timing couldn't be more perfect. With new regulations being passed, and with large language models and multimodal generative AI going mainstream, this is the right time to create defined safety guardrails (read: governance) for AI technology. Just as the Food and Drug Administration (FDA) exists to make sure a new food product or medicine meets safety protocols, we'll need a standardized approach to monitoring AI.

It's still early days, but I'm confident that this tool will expand access not only to AI itself, but also to the ability to mitigate risk and scale the use of trustworthy data.

