Inside AIR: A Look at What Powers the Bedrock Data Security Platform
Ganesha Shanmuganthan : Apr 12, 2024 2:00:00 PM
Recently the team at Bedrock Security announced our comprehensive data security platform, designed to effectively manage data risk introduced by cloud and generative AI (GenAI). The Bedrock platform continuously discovers, manages, and protects sensitive data, powered by our data AI Reasoning (AIR) Engine. Uniquely, AIR automatically understands what data is most material to an enterprise, enabling organizations to protect their most valuable assets without slowing down data growth or impeding the use of data to accelerate business success.
In an increasingly digital world, data security today is more important than ever. The volume and variety of data collected, stored, analyzed, and transmitted is already vast and continues to grow, fueled by the introduction of cloud and generative AI (GenAI). This data has intrinsic value, driving business decisions and strategies, but also motivating malicious actors to carry out near-constant attacks. According to the 2023 Verizon Data Breach Investigations Report, 95% of breaches are financially driven, either to steal money directly or sell sensitive stolen information on the black market.
Three Hardest Modern Data Security Problems: Solved
This rapidly changing threat and technology environment introduces three hard-to-solve data security challenges:
- Rapidly growing and moving data volumes: Discovering and classifying sensitive data is difficult, especially as it grows, moves, and is duplicated across distributed environments.
- Understanding structured and unstructured data types: Structured data is easier to analyze and categorize quickly than unstructured data, but all organizations have a mix of both, and traditional data security solutions are rigid in how they identify data types, making it hard to protect appropriately.
- Handling massive data volumes: Analyzing large volumes of data is typically slow, inaccurate, and expensive. To maintain accuracy, you need to rescan often, but rapidly growing data volumes and slow analysis times make this impossible. Fixed sampling looks at just a small percentage of all the data, which is much faster, but far less accurate, leaving sensitive data unprotected.
Bedrock Security designed a solution that addresses all three problems by developing the AIR engine and delivering its solution on a distributed, serverless processing architecture. Bedrock’s platform performs dynamic, adaptive sampling and applies AI reasoning to accelerate data classification. AIR helps organizations avoid the cost, inaccuracy, and friction resulting from today’s legacy DSPM solutions.
"Bedrock’s innovation excites me and aligns with how I think about protecting data and managing risk effectively."
Mukund Sarma, Sr Director Product Security, Fastest Growing US Fintech Co.
Enable Accuracy and Visibility
Every organization conducts risk assessments in order to understand where vulnerabilities lie and how to remediate them. This may be part of meeting compliance requirements, in response to a data breach, or as part of ongoing data security efforts. Risk assessments must identify, evaluate, and prioritize risks, including risks associated with storing, processing, and transmitting data within an organization. For many organizations, simply identifying all the data as it is created, modified, and moved across complex, distributed data environments is a significant challenge. DSPMs and legacy solutions struggle to maintain visibility across these cloud platforms and third-party services, resulting in gaps in data discovery and classification that increase risk.
Many traditional solutions rely on static rules, typically written as regular expressions (regexes), to identify and categorize data. These rules will inevitably miss anything that doesn’t conform to their rigid patterns, leaving much of the data unidentified and unprotected. Solutions that use brute force to review every line of every file will accurately assess the data, at least if it’s appropriately structured, but this process is expensive and can take months to complete, so the results will always be stale. Simply sampling a small percentage of the data avoids the same cost and speed issues, but inevitably the solution will not classify all data, leaving some of your sensitive information vulnerable. DSPMs and legacy solutions have difficulty identifying and classifying unstructured data and cannot handle the exponential growth of data without compromising performance.
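To make the rigidity of static rules concrete, here is a minimal sketch in Python. The pattern `SSN_RULE` and the sample records are assumptions made up for illustration; they are not taken from any particular product. The rule recognizes a social security number only in its dashed form, so the same value written with spaces or without separators goes undetected.

```python
import re

# Hypothetical static rule: matches SSNs only in the dashed form 123-45-6789.
SSN_RULE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssns(text: str) -> list[str]:
    """Return every substring that the static rule recognizes."""
    return SSN_RULE.findall(text)


# Illustrative records: the same value in three different layouts.
records = [
    "SSN: 123-45-6789",  # conforms to the rule
    "SSN: 123 45 6789",  # spaces instead of dashes
    "ssn=123456789",     # no separators at all
]

matches = [find_ssns(record) for record in records]
```

Only the first record is flagged. Loosening the pattern to catch the other two layouts would also start matching unrelated nine-digit numbers, which is the trade-off that makes rule-only classification hard to tune.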
The Bedrock platform uses your infrastructure’s existing APIs to discover structured and unstructured datasets across datastores, accounts, and infrastructure providers. This gives you comprehensive data insights, including data classification, lineage, and data maps/flows, and extends to new data types and contexts. Bedrock dynamically adjusts the sampling up and down, based on the characteristics of each specific file and data store. For example, for a structured dataset, such as a database, Bedrock can analyze a smaller sample and determine with an extremely high degree of certainty what type of data is stored in that database, classifying that data accurately and quickly. For unstructured data, Bedrock uses a larger sample to understand it. If the AIR engine identifies sensitive data in a particular folder, the sample size is increased to ensure that all sensitive data is identified and classified. The ability to adapt the scan based on the types of data and what is discovered accelerates the data discovery process and ensures accuracy and visibility into all your data.
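The general idea of adaptive sampling can be sketched in a few lines of Python. This is not Bedrock's implementation, which is not described in this post. The function name `adaptive_scan`, the starting rates (5% for structured data, 25% for unstructured), and the rule of doubling the rate on every hit are all assumptions chosen only to show the shape of the technique.

```python
import random
from typing import Callable, Sequence


def adaptive_scan(
    items: Sequence[str],
    is_sensitive: Callable[[str], bool],
    structured: bool,
    seed: int = 0,
) -> dict:
    """Scan a random sample, widening it whenever sensitive data is found."""
    rng = random.Random(seed)
    # Assumed starting rates: structured data gets the smaller sample.
    rate = 0.05 if structured else 0.25
    order = list(range(len(items)))
    rng.shuffle(order)

    # Scan at least one item when there is anything to scan.
    target = min(len(items), max(1, int(len(items) * rate)))
    scanned = hits = 0
    while scanned < target:
        item = items[order[scanned]]
        scanned += 1
        if is_sensitive(item):
            hits += 1
            # Assumed escalation rule: each hit doubles the rate,
            # capped at a full scan of the store.
            rate = min(1.0, rate * 2)
            target = max(target, int(len(items) * rate))
    return {"scanned": scanned, "hits": hits, "final_rate": rate}
```

With this rule, a store of 100 clean structured records is settled after 5 reads, while a store where every sampled record is sensitive escalates to a full scan. A real system would likely use richer signals than a single boolean check, but the control loop is the same: spend little where nothing is found and more where something is.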
Beyond finding and classifying your data and how the data flows, you also need to understand what it contains and who owns it. Bedrock uses large language models (LLMs) and other artificial intelligence (AI) and machine learning (ML) approaches to analyze the data and determine what data type a file is, what it contains, and who owns it. For example, banks use social security numbers for customers in account information, but they also store social security numbers for employees. While these are the same data type, the use and ownership of that data are completely different. The AIR engine helps you classify the data appropriately and orchestrate what permissions you want to set based on the business purpose of the data.
Achieve Data Security & Compliance
The ability to understand different data types and uses enables the Bedrock platform to generate risk and impact scores based on the volume of data in each datastore and the sensitivity of the data it contains. That rating includes the impact on your business if that data gets leaked, helping you to prioritize risk. DSPM and legacy solutions provide incomplete visibility, making it difficult to enforce security policies uniformly or ensure compliance across all assets. The AIR engine identifies and classifies all data, then creates an impact score to tell you which databases and datastores are most important to protect based on how critical the data is. Protecting your data effectively also requires continuous security assessments to enable real-time data detection and response (DDR) to anomalous activities, because both data and threats are ever-changing. The Bedrock platform conducts ongoing data discovery, analysis, and classification to enable organizations to ensure that any security or compliance violations are addressed quickly.
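As a rough illustration of scoring by volume and sensitivity, the sketch below ranks a few datastores. The formula, the `SENSITIVITY_WEIGHT` table, and the `Datastore` type are hypothetical; the actual scoring used by the AIR engine is not described here. The assumption is simply that impact grows with the most sensitive category present and, more slowly, with the number of records.

```python
import math
from dataclasses import dataclass

# Hypothetical weights: higher means more damaging if leaked.
SENSITIVITY_WEIGHT = {"public": 0, "internal": 1, "pii": 3, "financial": 4}


@dataclass
class Datastore:
    name: str
    record_count: int
    categories: tuple[str, ...]


def impact_score(store: Datastore) -> float:
    """Weight of the most sensitive category times log-scaled volume."""
    weight = max((SENSITIVITY_WEIGHT[c] for c in store.categories), default=0)
    return weight * math.log10(store.record_count + 1)


def rank_by_impact(stores: list[Datastore]) -> list[Datastore]:
    """Order datastores from highest to lowest assumed impact."""
    return sorted(stores, key=impact_score, reverse=True)


stores = [
    Datastore("dev-logs", 1_000_000, ("internal",)),
    Datastore("customers", 50_000, ("pii", "financial")),
    Datastore("marketing-site", 10_000_000, ("public",)),
]
ranked = [s.name for s in rank_by_impact(stores)]
```

Under these assumed weights the much smaller `customers` store ranks first, because what it holds matters more than how much it holds.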
“Within a week of implementing Bedrock, we noticed some unexpected data in our lowest development environment. This prompted us to review our system configurations and ensure everything was aligned with our protocols.”
Andrew Kuhn, Product Security Engineer, House Rx
And while complying with regulatory requirements is essential, ensuring compliance with internal company policies is also incredibly important. The Bedrock platform allows you to apply your own policies and constraints, then analyze those policies to ensure that you are meeting your own data security requirements. Bedrock makes that process easy by offering Trust Boundaries. These boundaries provide a fast, adaptive, and automated way to define, alert, and contain data policy violations based on exactly what’s important for your business. A few examples of how you can use Trust Boundaries include:
- Delineating who should have access to a specific set of data
- Ensuring no sensitive data can be copied outside of the production environment
- Preventing data from existing outside of defined regions
Bedrock’s Trust Boundaries allow you to write policies in human languages, giving you an easy and flexible way to manage data access securely. The AIR engine’s dynamic understanding of what Human Resources data means allows you to worry less about defining each data grouping and focus on how it can be used and who can access it. Your policies become less brittle and more powerful.
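To show what evaluating one such boundary might look like once it has been understood, here is a small Python sketch of the "defined regions" example. It assumes the human-language policy has already been translated into a structured rule; that translation step is not shown. The `Asset` and `RegionBoundary` types, the label names, and the region names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    region: str
    labels: frozenset  # data categories found in this asset


@dataclass(frozen=True)
class RegionBoundary:
    label: str                  # data category the boundary protects
    allowed_regions: frozenset  # regions where that data may live


def find_violations(assets: list[Asset], boundary: RegionBoundary) -> list[str]:
    """Return names of assets holding protected data outside allowed regions."""
    return [
        asset.name
        for asset in assets
        if boundary.label in asset.labels
        and asset.region not in boundary.allowed_regions
    ]


# Illustrative inventory and rule.
inventory = [
    Asset("orders-db", "us-east-1", frozenset({"customer_pii"})),
    Asset("orders-replica", "ap-south-1", frozenset({"customer_pii"})),
    Asset("docs-bucket", "ap-south-1", frozenset({"public"})),
]
boundary = RegionBoundary("customer_pii", frozenset({"us-east-1", "us-west-2"}))
violations = find_violations(inventory, boundary)
```

Here only `orders-replica` is reported: it holds the protected category in a region outside the boundary, while `docs-bucket` sits in the same region but holds nothing the rule covers.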
Minimize the Data Risk Surface
The Bedrock platform builds on its ability to conduct rapid, accurate risk analysis to understand all your data and manage policies and violations at scale to minimize your risk surface. It provides insights into what problems exist in your environment and provides recommendations on how to fix them. There are three broad categories that help you reduce risk and remediate problems quickly:
- Reducing Data: Identify all the data you have, determine what data you don’t need, and remove it or move it to cold storage so it is extremely hard to access.
- Minimizing Access: Access control is an important way to minimize data risk. Once you understand your data, how it flows, and who has access to it, you can determine who needs (and uses) access to data and limit access to those who truly need it.
- Hardening Data: Encrypting and tokenizing data are two ways that you can make data harder to steal, but to do that, you need continuous insight into what data exists across your environments.
Bedrock’s AIR engine minimizes the data risk surface by finding stale or ghost data so it can be eliminated, assessing the impact of identity access to enable permissioning in line with least privilege principles, and hardening the data to minimize the impact of a data breach. Through API integrations, Bedrock allows you to click a button to remediate risk in Microsoft Azure, Amazon Web Services, and Google Cloud Platform. Keep in mind that the more users have access to a sensitive data set, the more paths an attacker has available to compromise an account, and ultimately the data itself. The Bedrock platform enables you to minimize this risk surface by reducing data, minimizing access, and hardening data, then makes it easy to remediate any issues if or when they arise.
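The "minimizing access" idea can be illustrated with a short Python sketch that flags access grants nobody has used recently, which are natural candidates for removal under least privilege. The function `unused_grants`, the 90-day idle threshold, and the shape of the inputs are assumptions for illustration, not a description of how Bedrock evaluates access.

```python
from datetime import datetime, timedelta


def unused_grants(
    grants: list[tuple[str, str]],
    last_access: dict[tuple[str, str], datetime],
    now: datetime,
    max_idle_days: int = 90,
) -> list[tuple[str, str]]:
    """Return (user, dataset) grants never used or idle past the threshold."""
    cutoff = now - timedelta(days=max_idle_days)
    idle = []
    for grant in grants:
        last = last_access.get(grant)
        # A grant with no recorded access is treated as unused.
        if last is None or last < cutoff:
            idle.append(grant)
    return idle


# Illustrative data: three users hold access to the same dataset.
grants = [("alice", "payroll"), ("bob", "payroll"), ("carol", "payroll")]
last_access = {
    ("alice", "payroll"): datetime(2024, 4, 1),
    ("bob", "payroll"): datetime(2023, 11, 1),
}
candidates = unused_grants(grants, last_access, now=datetime(2024, 4, 12))
```

In this example `bob` and `carol` are flagged: one has not touched the dataset in over 90 days and the other never has. Passing `now` explicitly keeps the check deterministic, which makes it easier to test.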
Build a Robust Data Security Program
Using Bedrock’s platform and AIR engine, you can build and grow a strong data security program using a solution that is designed for today’s data security challenges. From escalating cyber threats to rapidly increasing data volumes, Bedrock enables comprehensive impact and risk analysis to quickly reduce identity overprovisioning for data access, minimize stale data, and track and contain intellectual property (IP).
For example, GenAI makes it easy to generate text, images, videos, and more using training data. But everything that goes in becomes part of that model’s training dataset, and it’s difficult to control the output. To leverage GenAI models, enterprises must be confident that these models are compliant and secure and will not output sensitive information. The Bedrock platform is designed to automatically learn what data is most material to each business, put boundaries between sensitive data and GenAI models, and enable organizations to bring GenAI to customers safely and responsibly.
The Bedrock platform enables organizations to address the full lifecycle of how customer data is handled by providing accurate, comprehensive visibility, enabling organizations to create custom data perimeters, and proactively reducing data risk through AI reasoning and integrations.
Learn more in the Bedrock Security platform data sheet.