XM Cyber GenAI – Empowering Users with Immediate Insights

Posted by: Dale Fairbrother
March 04, 2024

In this article, we will discuss how to unlock the hidden knowledge from XM Cyber’s Cloud Data Lake, via our new GenAI user interface. This new interface grants you the power to act on your cybersecurity risk posture with continuous tailored insights about your dynamic attack surface.

But before we jump into the new feature, let’s first talk about some things you may already know a bit about. Generative artificial intelligence (GenAI) is the use of generative models to produce text, images or data in response to a query. Step one for any new GenAI program or system is to learn, or be trained on, the patterns and structure of its data. This is typically done with example inputs and outputs, such as a set of questions and answers drawn from a dataset, which can then be used to model future responses. Over time, the AI learns how to interpret and respond to similar questions or queries.
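As a rough illustration of what "a set of questions and answers" can look like in practice, the sketch below collects question-and-answer pairs, drops near-duplicate questions, and serializes the rest as JSON Lines. Everything in it is an assumption made for illustration: the `normalize` and `build_training_records` helpers, the sample answers, and the record layout are hypothetical and do not describe how XM Cyber prepares its training data.

```python
import json


def normalize(question: str) -> str:
    """Lower-case, collapse whitespace and drop trailing punctuation."""
    return " ".join(question.lower().split()).rstrip("?.! ")


def build_training_records(raw_pairs):
    """Keep the first answer seen for each distinct (normalized) question."""
    seen = {}
    for question, answer in raw_pairs:
        key = normalize(question)
        seen.setdefault(
            key, {"question": question.strip(), "answer": answer.strip()}
        )
    return list(seen.values())


# Hypothetical sample data; the answers are placeholders, not product output.
RAW_PAIRS = [
    ("How many users does each AWS account have?",
     "Group user entities by AWS account and count them."),
    ("how many users does each  AWS account have",
     "Count users per account."),
    ("Show me a list of all workstations that aren't running an EDR agent.",
     "Filter workstation entities where no EDR agent is reported."),
]

records = build_training_records(RAW_PAIRS)
jsonl = "\n".join(json.dumps(record) for record in records)
print(jsonl)
```

The second pair is treated as a duplicate of the first, so only two records are written. Real pipelines would usually need fuzzier matching than this exact comparison on normalized text.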

AI and machine learning have been around for some time, and are commonly used by cybersecurity companies to analyze and process large quantities of data within their solution and return valuable and contextual insights about alerts, threats and security posture. However, the role of GenAI is more focused on the front-end application of AI, versus the back-end machine learning element.

The goal, of course, is to make our lives easier and to try and do some of the heavy lifting for data processes, analysis or reporting, which would often take a human many hours or days to do for themselves.

ChatGPT is probably the most well-known and accessible form of GenAI – you ask it a question (pretty much any question) and it generates a very detailed response, with the output tailored to how you asked the question and how you specified the parameters of the response.

Getting Started with GenAI

Anyone looking to develop a GenAI system may well ask how to train the model. How do you first teach the AI engine what it needs to know in order to respond correctly to the queries it will be asked in the future? The larger and higher-quality the source data the AI consumes, the more accurate its responses become over time, often with a small amount of tuning and handholding along the way.

One common methodology for training AI models has been to use human responses to Captchas. These were originally based on optical character recognition (OCR) and helped to identify letters, words and numbers. They eventually evolved into selecting image boxes that contain certain types of road signs, traffic lights, or bicycles, for example.

Originally, Captchas were designed to help websites distinguish a human from a machine or bot. However, repurposing them to train AI means that AI is now as good as, or better than, we are at completing them, which makes them no longer relevant for their original purpose.

The end result of using road-sign-based Captchas to train AI is that your self-driving car can now read and interpret traffic signals and other road users much more accurately. And as the demand for more sophisticated AI systems grows, there is an increasing need to find more innovative and elaborate ways to train the AI models.

To train our new GenAI model, our Customer Success team correlated common questions and answers from across our customer base and our support knowledge base to feed into the new system, and train the GenAI to accurately respond to queries relating to the vast data lake of entity data our platform provides.

Questions such as:


  1. Which of my Amazon EC2 instances are currently exposed to the internet, correlated across their geographical or regional deployment, along with how long they have been running?
  2. Show me a list of all workstations that aren’t running an EDR agent.
  3. How many users does each AWS account have?
  4. I want to see all my disabled active directory users who have domain admin permissions.
  5. Show me a list of Kubernetes namespaces and their associated cluster name.
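To make these questions concrete, here is a minimal sketch of the kind of structured query that questions 2 and 3 boil down to once the natural-language request has been interpreted. It is purely illustrative: the `Entity` record, its field names, and the sample data are hypothetical and are not the XM Cyber data model or API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical, heavily simplified entity record."""
    name: str
    kind: str                    # e.g. "workstation" or "aws_user"
    has_edr_agent: bool = False  # only meaningful for workstations
    aws_account: str = ""        # only meaningful for AWS users


# Made-up sample inventory for illustration only.
ENTITIES = [
    Entity("ws-01", "workstation", has_edr_agent=True),
    Entity("ws-02", "workstation", has_edr_agent=False),
    Entity("ws-03", "workstation", has_edr_agent=False),
    Entity("alice", "aws_user", aws_account="prod"),
    Entity("bob", "aws_user", aws_account="prod"),
    Entity("carol", "aws_user", aws_account="dev"),
]


def workstations_without_edr(entities):
    """Question 2: workstations that aren't running an EDR agent."""
    return sorted(
        e.name
        for e in entities
        if e.kind == "workstation" and not e.has_edr_agent
    )


def users_per_aws_account(entities):
    """Question 3: number of users in each AWS account."""
    return dict(Counter(e.aws_account for e in entities if e.kind == "aws_user"))


print(workstations_without_edr(ENTITIES))
print(users_per_aws_account(ENTITIES))
```

The point of a chat interface is that the user never has to write filters like these by hand. The question is asked in plain language and the equivalent query is worked out for them.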


The Benefits of GenAI in the XM Cyber Platform

The XM Cyber Continuous Exposure Management platform has over 48 million sensors deployed worldwide that continuously gather contextual information about enterprise entities, and analyze them in our cloud data lake. The entity data is correlated against additional sources such as threat intelligence and vulnerability databases. We then run millions of attack simulations to mirror different threat campaigns across the data lake to provide customers with unique insights into the attack paths and exposures that present the highest risk to their critical assets.

The sheer volume of data in the data lake, and the broad variety of security insights our platform provides, can sometimes be a little overwhelming for customers to visualize and process.

As such, our Customer Success team, who work hand in hand with our customers, are often asked similar questions about the different types of data and information we can provide. We always put the needs of our customers first. To further extend the continuous care offered by our 24/7 service, we wanted to build a new GenAI model that can specifically address these common but often complex questions about the rich contextual insights stored in the platform, with the primary goal of helping customers unlock the power of our comprehensive data lake of security intelligence.

As discussed in this and other articles, we gather large amounts of data, whether it’s from our sensors, or via API from IaaS cloud providers like Azure, GCP and AWS, PaaS like Kubernetes and more. The large number of vendors supported means XM Cyber has a wide variety of security insights about the network, without the silos of more limited point solutions.

Navigating the huge amount of insights and data can be challenging. Our new chat AI interface can drastically reduce the time it takes to understand and extract the specific insight you are looking to uncover. This means you can get results from a simple language interface, without needing to wait for data scientists or security engineers, and can immediately access detailed real-time insights about the current risk state and security posture of your entities.

Want to find out more about the new GenAI interface and how it can empower your organization? Check out this press release!
