Organizations frequently cite data collection as the primary challenge when implementing security analytics solutions. Gathering data from multiple sources across diverse environments, such as AWS, other cloud platforms, and on-premises infrastructure, is complex. The volume and variety of data continue to grow, resulting in an ever-evolving set of sources that must be continuously ingested and managed for analysis. For example, within AWS alone, organizations often need to gather VPC Flow Logs, WAF logs, CloudTrail events, and security findings from services such as Amazon GuardDuty and AWS Security Hub. Security teams want to centralize this data across all AWS regions and accounts, but achieving that centralization has become increasingly complex due to varying permission requirements and the distributed nature of the data.
The next challenge is that logs and alerts arrive in many different formats. To analyze the data effectively, security teams must first understand the format of each individual data source, then map and normalize every format before running any analysis. This work grows as new data sources are introduced, each requiring time to learn and interpret, making the task time-consuming and resource-intensive.
Once your data is normalized into a consistent format and stored centrally, the next step is integrating that data into an analytics tool. This requires building a data pipeline. In practice, this means extracting the necessary fields, transforming them into a format compatible with the analytics platform, and loading them into an index that supports fast querying and analysis. This process, known as Extract, Transform, and Load (ETL), requires specialized data analytics skills, and designing and implementing effective ETL pipelines is time-consuming.
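To make that effort concrete, here is a minimal, hypothetical sketch of the kind of hand-rolled pipeline a team would otherwise build and maintain: pull raw log objects from Amazon S3, reshape a handful of fields, and bulk-load the result into a search index. The bucket, key, index, and field names are placeholders, not a prescribed implementation.

```python
# Simplified ETL sketch; bucket, key, index, and field names are placeholders.
import gzip
import json

import boto3
from opensearchpy import OpenSearch, helpers

s3 = boto3.client("s3")
search = OpenSearch(
    hosts=[{"host": "search-example-domain.us-east-1.es.amazonaws.com", "port": 443}],
    use_ssl=True,
)

def extract(bucket: str, key: str) -> list[dict]:
    """Extract: download one gzipped JSON-lines log object from S3."""
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    return [json.loads(line) for line in gzip.decompress(body).splitlines() if line]

def transform(record: dict) -> dict:
    """Transform: keep only the fields the analytics tool needs, and rename them."""
    return {
        "timestamp": record.get("eventTime"),
        "source_ip": record.get("sourceIPAddress"),
        "action": record.get("eventName"),
    }

def load(docs: list[dict], index: str = "security-logs") -> None:
    """Load: bulk-index the transformed documents for fast querying."""
    helpers.bulk(search, ({"_index": index, "_source": doc} for doc in docs))

# This handles one object at a time; a production pipeline also needs scheduling,
# retries, schema-drift handling, and backfills -- the undifferentiated work the
# zero-ETL integration removes.
load([transform(r) for r in extract("example-log-bucket", "AWSLogs/example.json.gz")])
```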
Another significant challenge with data pipelines is cost. As data volumes increase, ingesting that data into an analytics tool becomes expensive. Organizations often face a difficult trade-off: ingest all available data at the risk of exceeding their budget, or limit ingestion to control costs. Excluding data from ingestion reduces visibility, making it harder to evaluate the organization's exposure to potential threats. As a result, organizations must strike a careful balance between managing costs and ensuring broad data access to maintain effective security monitoring.
Organizations facing these challenges often use multiple tools to manage their data analytics needs. A common approach is to ingest high-value or real-time data into a primary analytics tool for critical processing, while less important data is stored separately for later access and analysis. For example, many organizations use Amazon Athena to query data stored in S3 buckets and then use Amazon QuickSight for visualization, as sketched below. However, this approach requires teams to develop and maintain skills across each additional tool.
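As a hedged illustration of that second path, the snippet below runs an Athena query against log data in S3 using boto3. The database, table, column, and output bucket names are hypothetical placeholders.

```python
# Hypothetical example: querying log data in S3 with Athena via boto3.
import boto3

athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString="""
        SELECT eventname, sourceipaddress, count(*) AS attempts
        FROM cloudtrail_logs
        WHERE errorcode = 'AccessDenied'
        GROUP BY eventname, sourceipaddress
        ORDER BY attempts DESC
        LIMIT 20
    """,
    QueryExecutionContext={"Database": "security_logs_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)

# The query runs asynchronously: poll get_query_execution / get_query_results for
# the output, then hand the results to a separate visualization tool such as
# Amazon QuickSight -- a second tool and skill set to maintain.
print(response["QueryExecutionId"])
```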
All of this ultimately means that security teams must undertake complex data analytics initiatives to perform their core security functions. As a result, they need to spend a significant amount of time on data management rather than focusing on security operations and responding to threats.
Amazon OpenSearch Service offers zero-ETL integration with Amazon Security Lake, enabling immediate security insights directly from data stored in Security Lake without building complex data pipelines. With in-place search capabilities, users can query the data directly from Amazon OpenSearch Service. On-demand indexing allows selective ingestion of specific datasets based on the use case, unlocking OpenSearch's analytics capabilities where they are needed. This approach reduces the volume of data ingested, which lowers operational costs.
Centralizing security data is made simple with Amazon Security Lake. It automatically aggregates various AWS log sources into a purpose-built security data lake, optimized for different use cases. Customer feedback revealed a common challenge: security data and analytics were tightly coupled to specific tools, which locked the data in and limited flexibility and reuse. Security Lake was designed to address this by democratizing access to security data. It gives organizations full ownership and control over their logs and events, allowing them to use the analytics tool of their choice, or even multiple tools.
Amazon Security Lake normalizes data into the Open Cybersecurity Schema Framework (OCSF), a widely adopted open standard designed to unify security data formats across the industry. OCSF is backed by over 900 contributors and 200 organizations, including major security vendors, government agencies, academic institutions, and enterprises, all collaborating to develop a standard schema. The primary benefit of OCSF is its vendor-agnostic taxonomy, which allows organizations to ingest and analyze security data without the need for custom transformations or normalization. This standardization accelerates data onboarding and enables consistent analytics across diverse data sources, vendors, and environments.
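For a sense of what this normalization looks like, here is an abridged, illustrative OCSF-style record for the Network Activity class. The values are made up, and the full attribute set is defined by the OCSF schema itself.

```python
# Abridged, illustrative OCSF-style record (values are made up).
ocsf_network_activity = {
    "class_uid": 4001,      # Network Activity event class
    "category_uid": 4,      # Network Activity category
    "activity_id": 6,       # Traffic
    "time": 1700000000000,  # event time, epoch milliseconds
    "severity_id": 1,       # Informational
    "cloud": {"provider": "AWS", "region": "us-east-1"},
    "src_endpoint": {"ip": "10.0.0.12", "port": 443},
    "dst_endpoint": {"ip": "198.51.100.7", "port": 55012},
    "metadata": {
        "product": {"name": "Amazon VPC", "vendor_name": "AWS"},
        "version": "1.1",   # OCSF schema version
    },
}
```

Because every source lands in this shared taxonomy, a query written against attributes such as src_endpoint.ip or class_uid works the same way regardless of which product emitted the original log.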
Amazon OpenSearch Service is a leading search and log analytics platform that enables organizations to extract maximum value from their data. As the AWS-managed service for the open-source OpenSearch project, it offers advanced capabilities including alerting, visualization, dashboarding, anomaly detection, and security analytics. The service is designed for secure, efficient, and scalable analysis of business, security, and operational data. With built-in integrations, such as with Amazon Security Lake, OpenSearch Service provides a cost-effective solution for end-to-end data analysis across diverse use cases.
The zero-ETL integration for Amazon Security Lake allows users to query and analyze security data directly within Security Lake, without duplicating data across tools or building and maintaining custom data pipelines. The setup process is straightforward, and in-place querying lets you explore the data without ingesting it first. For recurring queries, on-demand indexing delivers faster execution and improved performance. Pre-built security queries and dashboards provide immediate insights and simplify exploration of your data.
Simple setup
With a simple setup process, you start by creating a subscriber in Amazon Security Lake and then configuring a Security Lake data source within Amazon OpenSearch Service. This automatically provides all necessary components, including a serverless collection and dashboards. It also creates a dedicated OpenSearch UI workspace designed specifically for security use cases. This new interface also allows you to bring in other data sources, such as CloudWatch Logs, enabling complete end-to-end visibility from a single, unified view.
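As a rough sketch of the Security Lake side of that setup, the call below creates a query-access (Lake Formation) subscriber for the account that owns the OpenSearch Service domain; the matching data source is then configured from OpenSearch Service. The account IDs, external ID, and source versions are assumptions for illustration, so treat the exact parameters as approximate and check the Security Lake API reference.

```python
# Hedged sketch; account IDs, external ID, and source versions are placeholders.
import boto3

securitylake = boto3.client("securitylake")

subscriber = securitylake.create_subscriber(
    subscriberName="opensearch-zero-etl",
    subscriberDescription="Query-in-place access for Amazon OpenSearch Service",
    accessTypes=["LAKEFORMATION"],           # query access, no S3 data copy
    subscriberIdentity={
        "principal": "111122223333",         # account hosting the OpenSearch domain
        "externalId": "opensearch-example",
    },
    sources=[
        {"awsLogSource": {"sourceName": "VPC_FLOW", "sourceVersion": "2.0"}},
        {"awsLogSource": {"sourceName": "CLOUD_TRAIL_MGMT", "sourceVersion": "2.0"}},
    ],
)
print(subscriber["subscriber"]["subscriberArn"])
```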
In-place querying of Security Lake data
With in-place querying of Security Lake data, there is no need to duplicate the same data across multiple systems. You can query it directly and view it alongside application logs and other datasets already in Amazon OpenSearch Service. This integration leverages the Apache Iceberg format, which enables faster query performance, providing immediate and efficient access to your data. You can use SQL or OpenSearch Piped Processing Language (PPL) to run analytical queries across tables. The data is structured using the Open Cybersecurity Schema Framework, offering a standardized layer that simplifies data access and query development.
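The query below is purely illustrative of the kind of SQL you can run in place against Security Lake tables from OpenSearch. The data source, database, table, and partition names are placeholders and will follow your own Security Lake deployment; the columns are OCSF attributes.

```python
# Illustrative only; data source, database, table, and partition names are placeholders.
TOP_API_CALLS = """
SELECT api.operation AS operation, count(*) AS calls
FROM securitylake_datasource.amazon_security_lake_glue_db_us_east_1
     .amazon_security_lake_table_us_east_1_cloud_trail_mgmt_2_0
WHERE eventday = '20250101'
GROUP BY api.operation
ORDER BY calls DESC
LIMIT 20
"""
```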
On-demand indexing
On-demand indexing eliminates concerns about long-running queries. If you need fast, repeated access to specific data, such as during a security investigation, you can create an on-demand index. For example, suppose you are investigating a GuardDuty finding related to an EC2 instance and want to analyze VPC Flow Logs for a specific timeframe. In that case, you can run a direct query against Security Lake, ingest the results, and store them in an indexed view. All subsequent queries on that data then execute in milliseconds, enabling fast access for visualizations, alerting, and deeper analysis.
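As a hedged illustration of that investigation flow, the statement below materializes a narrow slice of VPC Flow Logs (one suspect IP over a short time window) into a locally indexed view, so follow-up queries, visualizations, and alerts run against fast, local data. The view name, table names, OCSF columns, and options are placeholders; in practice the indexing choice is driven from the OpenSearch query interface.

```python
# Illustrative acceleration statement; names, columns, and options are placeholders.
CREATE_INVESTIGATION_VIEW = """
CREATE MATERIALIZED VIEW investigation_i_0abc1234_flows AS
SELECT time,
       src_endpoint.ip AS src_ip,
       dst_endpoint.ip AS dst_ip,
       dst_endpoint.port AS dst_port,
       traffic.bytes AS bytes
FROM securitylake_datasource.amazon_security_lake_glue_db_us_east_1
     .amazon_security_lake_table_us_east_1_vpc_flow_2_0
WHERE eventday BETWEEN '20250110' AND '20250112'
  AND (src_endpoint.ip = '10.0.0.12' OR dst_endpoint.ip = '10.0.0.12')
WITH (auto_refresh = false)
"""
```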
Pre-built queries and dashboards
Pre-built queries and dashboards provide immediate, out-of-the-box insights without any setup required. The integration includes over 200 pre-built queries and dashboards covering key data sources such as VPC Flow Logs, WAF Logs, and AWS CloudTrail Management Events.
When it comes to data collection, Amazon Security Lake streamlines the process of centralizing security data across AWS environments. With just a few clicks, users can enable multiple log sources across various accounts and regions. Security Lake also supports the integration of third-party data. This allows organizations to quickly establish a centralized repository for all their data, simplifying collection and management.
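The same enablement can be scripted; the hedged sketch below turns on two AWS log sources across a set of accounts and regions in one call. Account IDs, regions, and source versions are placeholders, and third-party (custom) sources are registered separately.

```python
# Hedged sketch; account IDs, regions, and source versions are placeholders.
import boto3

securitylake = boto3.client("securitylake")

securitylake.create_aws_log_source(
    sources=[
        {
            "sourceName": "VPC_FLOW",
            "sourceVersion": "2.0",
            "accounts": ["111122223333", "444455556666"],
            "regions": ["us-east-1", "eu-west-1"],
        },
        {
            "sourceName": "CLOUD_TRAIL_MGMT",
            "sourceVersion": "2.0",
            "accounts": ["111122223333", "444455556666"],
            "regions": ["us-east-1", "eu-west-1"],
        },
    ]
)
```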
Security Lake automatically normalizes collected data into the Open Cybersecurity Schema Framework (OCSF), providing a consistent structure across diverse data sources. It also optimizes the data for efficient storage and query performance, so it is ready for fast, efficient access when it's needed.
With the zero-ETL integration, setting up a connection between Security Lake and Amazon OpenSearch Service is easy, and no data pipelines are required. Once the connection is established, data can be queried directly from Security Lake using Amazon OpenSearch Service. This in-place querying approach eliminates the need to move or duplicate data.
In-place querying provides complete visibility into all data stored in Security Lake. Running queries directly from Amazon OpenSearch Service avoids ingesting large volumes of data into OpenSearch, which reduces compute requirements and lowers costs. On-demand indexing lets you decide, at query time, whether to ingest specific results into OpenSearch. This streamlined workflow makes it easy to determine what to index and when, all within the query interface.
Amazon OpenSearch Service provides a unified platform for real-time and historical security analysis. To simplify adoption, the integration includes pre-built queries and dashboards that deliver immediate security insights after the connection to Security Lake is established. These pre-built assets can be customized as needed, or users can build their own. Amazon OpenSearch Service offers robust analytics capabilities to support a wide range of security use cases.
This results in a substantial reduction in the time required for data management. Teams no longer need to create pipelines, run ETL jobs, or prepare data. Instead, they can concentrate on core security tasks such as monitoring, threat investigation, and incident response.