Global Query Tutorial

Global Query Tutorial

Understanding Uptycs Schema 

Introduction 

Welcome to this step-by-step tutorial on understanding and effectively using the Uptycs 
Schema. This guide will help you navigate the complex data structure of Uptycs, create effective 
queries, and unlock valuable insights from your cloud and on-premises environments. 
 
Uptycs provides a unified view of your cloud and on-premises assets, allowing you to monitor, 
analyze, and secure your infrastructure. To make the most of this platform, it's essential to 
understand how data is organized and how to construct queries that deliver accurate and 
meaningful results. 

Schema Overview 

Uptycs organizes data across multiple categories, each representing different cloud providers 
and technologies: 
 
-  AWS resources: Cloud resources from Amazon Web Services 
-  Azure resources: Cloud resources from Microsoft Azure 
-  GCP resources: Cloud resources from Google Cloud Platform 
-  Kubernetes resources: Container orchestration resources 
-  Cross-platform resources: Tables that span multiple platforms (processes, packages, 
etc.) 

Table Types and Their Uses 

The Uptycs schema contains different types of tables that serve various purposes. 
Understanding these table types is crucial for constructing effective queries. 

_current Tables vs. Historical Tables 

Uptycs uses two primary table types: _current tables and historical tables (tables without the 
_current suffix). 

_current Tables 

-  Represent the most recent state snapshot 
-  Deduplicated, showing only the current state 
-  Default choice for most queries 
-  Used when you don't need historical data 
-  Example: aws_ec2_instance_current 

Historical Tables 

-  Contain historical/time-series data 
-  Include a upt_day column (date format stored as integer, e.g., 20250301) 
-  May have multiple entries per object 
-  Used for tracking changes over time 
-  Required when you specify "historic" in your requirements 
-  Example: aws_ec2_instance (without "_current") 
-  Not all tables follow the "*_current" naming pattern (e.g., upt_assets table) 

Example: Comparing _current vs. Historical Tables 

Current state only (using _current table): 
 
SELECT  
    instance_id, 
    instance_type, 
    private_ip_address, 
    public_ip_address, 
    state_name 
FROM  
    aws_ec2_instance_current 
WHERE  
    account_id = '123456789012' 
    -- No upt_day filter needed since _current tables only contain the latest 
state 
    -- _current tables are already filtered and deduplicated 

Historical view (using historical table): 
 
SELECT  
    instance_id, 
    instance_type, 
    private_ip_address, 
    public_ip_address, 
    state_name, 
    upt_day 
FROM  
    aws_ec2_instance 
WHERE  
Unset
    account_id = '123456789012' 
    -- upt_day filter is REQUIRED for historical tables - NOT OPTIONAL 
    -- Queries on historical tables will fail or return excessive data without 
this filter 
    AND upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
ORDER BY 
    instance_id, 
    upt_day 

Understanding the upt_day Column 

The upt_day column is a critical component of historical tables that enables historical analysis: 
 
-  Format: Integer representation of a date (YYYYMMDD) 
-  Example: March 1, 2025 would be represented as 20250301 
-  Used to filter data for specific time ranges 
-  REQUIRED for all queries on historical tables - queries will fail or produce excessive 
data without this filter 
-  Critical for both correctness and performance 

Best Practices for upt_day Filtering 

Always include a upt_day filter when querying historical tables - this is not optional: 
 
-- Last 7 days of data 
WHERE upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), '%Y%m%d') 
AS INT) 
 
-- Current day only 
WHERE upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), '%Y%m%d') 
AS INT) 
 
-- Specific date range 
WHERE upt_day BETWEEN CAST('20250201' AS INT) AND CAST('20250228' AS INT) 

Linking Assets with upt_asset_id 

The upt_asset_id field is a crucial identifier that links compute assets with runtime 
information: 
 
-  Uniquely identifies cloud resources across the Uptycs schema 
-  Enables joining compute resources with installed software, processes, and security 
findings 
-  Only available for compute resources (EC2, Lambda, VMs, etc.) 
-  Not available for non-compute resources like VPCs, security groups, etc. 

Working with Mixed Table Types 

Sometimes you need to join _current tables with historical tables. Here's how to handle these 
situations correctly: 

Example: Joining _current with Historical Tables 
SELECT  
    ec2.instance_id, 
    ec2.private_ip_address, 
    proc.name AS process_name, 
    proc.pid, 
    proc.cmdline 
FROM  
    aws_ec2_instance_current ec2 
    -- Using _current table for EC2 instances (no upt_day needed) 
JOIN  
    processes proc ON ec2.upt_asset_id = proc.upt_asset_id 
    -- Joining tables using upt_asset_id links instances with their processes 
WHERE  
    proc.name LIKE '%nginx%' 
    -- Filtering for processes with "nginx" in the name 
    AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- IMPORTANT: Apply upt_day filter only to historical tables (processes) 
    -- Without this filter, the query would process excessive historical data 

Key points: 
 
-  Apply the upt_day filter only to the historical table (processes) 
-  The _current table (aws_ec2_instance_current) doesn't need a upt_day filter 
Unset
Unset
-  This approach joins current EC2 state with current process information 
-  Always include a comment explaining why the upt_day filter is used 

Working with JSON Fields 

Many cloud resources store complex data in JSON format, particularly in GCP and Azure. 
Understanding how to extract and filter JSON data is essential for effective cloud resource 
analysis. 

Example: Extracting Tags from AWS Resources 
SELECT  
    instance_id, 
    JSON_EXTRACT_SCALAR(tags, '$.Name') AS name_tag, 
    JSON_EXTRACT_SCALAR(tags, '$.Environment') AS environment_tag, 
    JSON_EXTRACT_SCALAR(tags, '$.Owner') AS owner_tag 
FROM  
    aws_ec2_instance_current 
WHERE  
    JSON_EXTRACT_SCALAR(tags, '$.Environment') = 'Production' 

Example: Working with GCP Network Interfaces 
SELECT  
    name AS instance_name, 
    JSON_EXTRACT_SCALAR(network_interfaces, '$[0].networkIP') AS primary_ip, 
    JSON_EXTRACT_SCALAR(network_interfaces, '$[0].subnetwork') AS subnet, 
    JSON_EXTRACT_SCALAR(network_interfaces, '$[0].accessConfigs[0].natIP') AS 
public_ip 
FROM  
    gcp_compute_instance_current 
WHERE  
    JSON_EXTRACT_SCALAR(network_interfaces, '$[0].accessConfigs[0].natIP') IS 
NOT NULL 

Cross-Cloud Queries and Data Consistency 

When querying across multiple cloud providers, you need to handle differences in schema 
structure and data types: 

Example: Finding Instances Across Multiple Clouds 
-- AWS EC2 Instances 
SELECT  
    'AWS' AS cloud_provider, 
    instance_id AS resource_id, 
    instance_type AS resource_type, 
    private_ip_address AS private_ip, 
    public_ip_address AS public_ip, 
    state_name AS state 
FROM  
    aws_ec2_instance_current 
    -- Using _current tables to get latest state 
 
UNION ALL 
 
-- GCP VM Instances 
SELECT  
    'GCP' AS cloud_provider, 
    name AS resource_id, 
    machine_type AS resource_type, 
    JSON_EXTRACT_SCALAR(network_interfaces, '$[0].networkIP') AS private_ip, 
    JSON_EXTRACT_SCALAR(network_interfaces, '$[0].accessConfigs[0].natIP') AS 
public_ip, 
    status AS state 
FROM  
    gcp_compute_instance_current 
    -- Using _current tables to get latest state 
 
UNION ALL 
 
-- Azure VM Instances 
SELECT  
    'Azure' AS cloud_provider, 
    name AS resource_id, 
    JSON_EXTRACT_SCALAR(hardware_profile, '$.vmSize') AS resource_type, 
    JSON_EXTRACT_SCALAR(network_profile, 
'$.networkInterfaces[0].privateIPAddress') AS private_ip, 
    JSON_EXTRACT_SCALAR(network_profile, 
'$.networkInterfaces[0].publicIPAddress') AS public_ip, 
    provisioning_state AS state 
FROM  
    azure_compute_vm_current 
    -- Using _current tables to get latest state

Key points: 
 
-  Notice how we handle JSON fields in GCP and Azure using JSON extraction functions 
-  Column names differ across providers (instance_id vs. name) 
-  Data structure differs (scalar values in AWS vs. JSON arrays in GCP/Azure) 
-  Using common column aliases creates a consistent output format 

Advanced Security and Compliance Queries 
Let's explore advanced scenarios that combine different table types and demonstrate powerful 
security and compliance use cases: 
 
MySQL Installations Across Package Types and Platforms 
This query provides a comprehensive view of MySQL installations across different cloud 
providers (AWS, GCP, Azure) and package management systems (DEB, RPM), enabling 
effective software inventory and compliance monitoring. 
 
Detection Coverage: 
 
-  MySQL package installations 
-  Package versions and sources 
-  Cross-cloud deployment tracking 
-  Operating system variations 
-  Package management systems 
 
Key Components: 
 
1.  Package Information 
 
-  Package names and versions 
-  Installation sources 
-  Package management type 
 
2.  Cloud Resource Context 
 
-  Instance identification 
-  Network information 
-  Cloud provider details 
 
3.  System Details 
 
-  Operating system versions 
-  Hostname information 
-  Asset correlation 
 
Performance Optimization: 
 
-  Efficient UNION ALL operations 
-  Proper historical table filtering 
-  Optimized package name matching 
-  Strategic table joins 
 
-- Debian Packages on AWS EC2 
SELECT  
    'AWS' AS cloud_provider, 
    'DEB' AS package_type, 
    ec2.instance_id AS resource_id, 
    ec2.private_ip_address, 
    sys.hostname, 
    sys.upt_os_version AS os_version, 
    deb.name AS package_name, 
    deb.version AS package_version, 
    deb.source AS package_source 
FROM  
    aws_ec2_instance_current ec2 
JOIN  
    system_info sys ON ec2.upt_asset_id = sys.upt_asset_id 
    -- Using system_info for host information 
JOIN  
    deb_packages deb ON ec2.upt_asset_id = deb.upt_asset_id 
WHERE  
    deb.name LIKE '%mysql%' 
    -- Matching on MySQL packages (mysql-server, mysql-client, mysql-common, 
etc.) 
    AND deb.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- REQUIRED upt_day filter for the historical deb_packages table 
    AND sys.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- REQUIRED upt_day filter for the historical system_info table 
 
UNION ALL 
 
-- RPM Packages on AWS EC2 
SELECT  
    'AWS' AS cloud_provider,'RPM' AS package_type, 
    ec2.instance_id AS resource_id, 
    ec2.private_ip_address, 
    sys.hostname, 
    sys.upt_os_version AS os_version, 
    rpm.name AS package_name, 
    rpm.version AS package_version, 
    rpm.vendor AS package_source 
FROM  
    aws_ec2_instance_current ec2 
JOIN  
    system_info sys ON ec2.upt_asset_id = sys.upt_asset_id 
    -- Using system_info for host information 
JOIN  
    rpm_packages rpm ON ec2.upt_asset_id = rpm.upt_asset_id 
WHERE  
    rpm.name LIKE '%mysql%' 
    -- Matching on MySQL packages (mysql-server, mysql-client, mysql-libs, 
etc.) 
    AND rpm.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- REQUIRED upt_day filter for the historical rpm_packages table 
    AND sys.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- REQUIRED upt_day filter for the historical system_info table 
 
UNION ALL 
 
-- Packages on GCP instances 
SELECT  
    'GCP' AS cloud_provider, 
    pkg.package_type, 
    gcp.name AS resource_id, 
    JSON_EXTRACT_SCALAR(gcp.network_interfaces, '$[0].networkIP') AS 
private_ip_address, 
    sys.hostname, 
    sys.upt_os_version AS os_version, 
    pkg.name AS package_name, 
    pkg.version AS package_version, 
    pkg.vendor AS package_source 
FROM  
    gcp_compute_instance_current gcp 
JOIN  
    system_info sys ON gcp.upt_asset_id = sys.upt_asset_id 
    -- Using system_info for host information 
JOIN  
    -- Using a subquery to unify package sources 
    ( 
        SELECT upt_asset_id, 'DEB' AS package_type, name, version, source AS 
vendor, upt_day 
        FROM deb_packages 
        WHERE name LIKE '%mysql%' 
         
        UNION ALL 
         
        SELECT upt_asset_id, 'RPM' AS package_type, name, version, vendor, 
upt_day 
        FROM rpm_packages 
        WHERE name LIKE '%mysql%' 
    ) pkg ON gcp.upt_asset_id = pkg.upt_asset_id 
WHERE  
    pkg.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- REQUIRED upt_day filter for the historical package data 
    AND sys.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- REQUIRED upt_day filter for the historical system_info table 
 
UNION ALL 
 
-- Packages on Azure VMs 
SELECT  
    'Azure' AS cloud_provider, 
    pkg.package_type, 
    az.name AS resource_id, 
    JSON_EXTRACT_SCALAR(az.network_profile, 
'$.networkInterfaces[0].privateIPAddress') AS private_ip_address, 
    sys.hostname, 
    sys.upt_os_version AS os_version, 
    pkg.name AS package_name, 
    pkg.version AS package_version, 
    pkg.vendor AS package_source 
FROM  
    azure_compute_vm_current az 
JOIN  
    system_info sys ON az.upt_asset_id = sys.upt_asset_id 
    -- Using system_info for host information 
JOIN  
    -- Using a subquery to unify package sources 
    ( 
        SELECT upt_asset_id, 'DEB' AS package_type, name, version, source AS 
vendor, upt_day 
        FROM deb_packages 
        WHERE name LIKE '%mysql%' 
         
        UNION ALL 
         
        SELECT upt_asset_id, 'RPM' AS package_type, name, version, vendor, 
upt_day 
        FROM rpm_packages 
        WHERE name LIKE '%mysql%' 
    ) pkg ON az.upt_asset_id = pkg.upt_asset_id 
WHERE  
    pkg.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- REQUIRED upt_day filter for the historical package data 
    AND sys.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- REQUIRED upt_day filter for the historical system_info table 
ORDER BY 
    cloud_provider, 
    resource_id, 
    package_name 
 
Finding EC2 Instances with Vulnerable MySQL Packages 
This query identifies EC2 instances running potentially vulnerable MySQL packages by 
correlating instance data with vulnerability information, helping security teams prioritize patching 
and remediation efforts. 
 
Detection Coverage: 
 
-  MySQL package vulnerabilities 
-  Critical and high severity CVEs 
-  Package version tracking 
-  Cross-instance correlation 
 
Key Components: 
 
1.  Vulnerability Analysis 
 
-  CVE identification 
-  Severity assessment 
-  Package version tracking 
 
2.  Package Context 
 
-  Package name verification 
-  Version information 
-  Installation details 
 
3.  Instance Correlation 
 
-  Resource identification 
-  Network context 
-  Asset tracking 
 
Performance Optimization: 
 
-  Uses _current tables for EC2 instances 
-  Efficient vulnerability joins 
-  Proper historical table filtering 
-  Optimized severity filtering 
 
SELECT  
    ec2.instance_id,ec2.private_ip_address, 
    vuln.cve_name, vuln.package_name, vuln.package_version, 
    vuln.advisory_severity, vuln.package_type 
FROM  
    aws_ec2_instance_current ec2 
    -- Using _current table for EC2 to get current state of instances 
JOIN  
    vulnerabilities vuln ON ec2.upt_asset_id = vuln.upt_asset_id 
    -- Joining on upt_asset_id links instances with their vulnerabilities 
WHERE  
    vuln.package_name LIKE '%mysql%' 
    -- Focusing on MySQL-related vulnerabilities 
    AND vuln.advisory_severity IN ('critical', 'high') 
    -- Filtering for only high-severity vulnerabilities 
    AND vuln.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- IMPORTANT: Apply upt_day filter to vulnerabilities (historical table) 
    -- This filter is REQUIRED, not optional 
ORDER BY  
    ec2.instance_id, 
    vuln.advisory_severity 
    -- Ordering by instance and severity for better readability

Finding Cloud Resources with Network Connections 
This query provides comprehensive visibility into network connections across cloud resources, 
helping identify active network communications and potential security concerns. 
 
Detection Coverage: 
 
-  Active network connections 
-  Process-to-network mapping 
-  Cross-instance communication 
-  Protocol and port usage 
 
Key Components: 
 
1.  Network Connection Analysis 
 
-  Socket state tracking 
-  Port monitoring 
-  Address resolution 
 
2.  Process Context 
 
-  Process identification 
-  Resource utilization 
-  Network activity correlation 
 
3.  Cloud Resource Correlation 
 
-  Instance metadata 
-  Network interface details 
-  Cross-cloud visibility 
 
Performance Optimization: 
 
-  Uses _current tables for cloud assets 
-  Efficient process-socket joins 
-  Proper historical table filtering 
-  Optimized multi-table correlation 
 
SELECT  
    ec2.instance_id, 
    ec2.private_ip_address, 
    proc.name AS process_name, 
    sockets.local_address, 
    sockets.local_port, 
    sockets.remote_address, 
    sockets.remote_port, 
    sockets.state 
FROM  
    aws_ec2_instance_current ec2 
    -- Using _current table for EC2 instances (current state only) 
JOIN  
    processes proc ON ec2.upt_asset_id = proc.upt_asset_id 
    -- Joining with processes using upt_asset_id 
JOIN  
    process_open_sockets sockets ON proc.pid = sockets.pid AND 
proc.upt_asset_id = sockets.upt_asset_id 
    -- Joining process_open_sockets with processes using both pid and 
upt_asset_id 
    -- Double join condition ensures correct process-to-socket mapping 
WHERE  
    sockets.remote_port = 443 
    -- Filtering for HTTPS connections (port 443) 
    AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Using current day filter for processes (historical table) 
    AND sockets.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Using current day filter for process_open_sockets (historical table) 
    -- IMPORTANT: Each historical table needs its own upt_day filter 
    -- _current tables (ec2) don't need upt_day filters 

Finding Instances with Exposed Sensitive Ports 
This query identifies cloud instances that have sensitive ports exposed to the internet through a 
combination of listening services and permissive security group rules, helping teams identify 
potential security vulnerabilities. 
 
Detection Coverage: 
 
-  Sensitive port exposure (SSH, RDP, databases) 
-  Public IP accessibility 
-  Security group misconfigurations 
-  Process-port relationships 
 
Key Components: 
 
1.  Port Analysis 
 
-  Service identification 
-  Protocol verification 
-  Binding interface checks 
 
2.  Security Group Evaluation 
 
-  Rule analysis 
-  CIDR range verification 
-  Protocol matching 
 
3.  Process Context 
 
-  Service identification 
-  Path verification 
-  Resource correlation 
 
4.  Network Exposure 
 
-  Public IP detection 
-  Interface binding analysis 
-  Port range validation 
 
Performance Optimization: 
 
-  Uses _current tables for cloud resources 
-  Efficient security group joins 
-  Proper historical table filtering 
-  Optimized port range checks 
 
SELECT  
    ec2.instance_id, 
    ec2.private_ip_address, 
    ec2.public_ip_address, 
    ec2.account_id, 
    ec2.region, 
    listen.port, 
    listen.protocol, 
    proc.name AS process_name, 
    proc.path AS process_path, 
    sg.security_group_name AS security_group 
FROM  
    aws_ec2_instance_current ec2 
    -- Using _current table for EC2 instances 
JOIN  
    listening_ports listen ON ec2.upt_asset_id = listen.upt_asset_id 
    -- Joining with listening ports information 
JOIN  
    processes proc ON listen.pid = proc.pid  
        AND listen.upt_asset_id = proc.upt_asset_id 
    -- Joining with process information 
JOIN  
    aws_ec2_security_group_current sg ON ec2.vpc_id = sg.vpc_id 
    -- Joining with security group information 
JOIN  
    aws_ec2_security_group_rule_current rule ON sg.security_group_id = 
rule.group_id 
    -- Joining with security group rules 
WHERE  
    listen.port IN (22, 3389, 1433, 3306, 5432, 27017, 6379, 9200, 8080, 8443) 
    -- List of potentially sensitive ports 
    AND ( 
        rule.cidr_ipv4 = '0.0.0.0/0' OR  
        rule.cidr_ipv6 = '::/0' 
    ) 
    -- Looking for rules that allow access from anywhere 
    AND ( 
        (rule.from_port <= listen.port AND rule.to_port >= listen.port) OR 
        rule.from_port = -1 -- All ports 
    ) 
    -- Ensuring the port is within the allowed range 
    AND listen.address = '0.0.0.0' 
    -- Looking for services binding to all interfaces 
    AND ec2.public_ip_address IS NOT NULL 
    -- Only including instances with public IPs 
    AND listen.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Current day filter for listening_ports - REQUIRED 
    AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Current day filter for processes - REQUIRED 
ORDER BY  
    ec2.instance_id, 
    listen.port 

Detecting Unusual Process Activity Across Cloud Resources 
This query monitors and identifies potentially malicious or unusual process activity across cloud 
environments by detecting suspicious process patterns, command execution, and high-risk 
behaviors. 
 
Detection Coverage: 
 
-  Network utility processes 
-  Reverse shell attempts 
-  Base64 encoded execution 
-  Suspicious file locations 
-  Cross-cloud process activity 
 
Key Components: 
 
1.  Process Analysis 
 
-  Name and path verification 
-  Command line inspection 
-  Execution context tracking 
 
2.  Cloud Resource Context 
 
-  Provider identification 
-  Instance correlation 
-  System information 
 
3.  Security Pattern Detection 
 
-  Network tool usage 
-  Encoded command execution 
Unset
-  Temporary file execution 
-  Privilege context 
 
Performance Optimization: 
 
-  Efficient cloud resource joins 
-  Proper historical table filtering 
-  Multi-provider correlation 
-  Optimized process tracking 
 
SELECT  
    sys.hostname, 
    CASE 
        WHEN ec2.instance_id IS NOT NULL THEN 'AWS' 
        WHEN gcp.name IS NOT NULL THEN 'GCP' 
        WHEN az.name IS NOT NULL THEN 'Azure' 
        ELSE 'Other' 
    END AS cloud_provider, 
    COALESCE(ec2.instance_id, gcp.name, az.name) AS resource_id, 
    proc.name AS process_name, 
    proc.path, 
    proc.cmdline, 
    proc.start_time, 
    proc.uid, 
    proc.gid 
FROM  
    processes proc 
    -- Using processes table (historical) 
JOIN  
    system_info sys ON proc.upt_asset_id = sys.upt_asset_id 
    -- Joining with system information 
LEFT JOIN  
    aws_ec2_instance_current ec2 ON proc.upt_asset_id = ec2.upt_asset_id 
    -- Optional join with AWS resources 
LEFT JOIN  
    gcp_compute_instance_current gcp ON proc.upt_asset_id = gcp.upt_asset_id 
    -- Optional join with GCP resources 
LEFT JOIN  
    azure_compute_vm_current az ON proc.upt_asset_id = az.upt_asset_id 
    -- Optional join with Azure resources 
WHERE  
    ( 
        proc.name IN ('nc', 'netcat', 'ncat', 'socat', 'nmap') 
        -- Looking for network utility processes 
        OR proc.cmdline LIKE '%reverse-shell%' 
        -- Looking for potential reverse shells 
        OR proc.cmdline LIKE '%base64 -d%' 
        -- Looking for base64 decoded execution 
        OR proc.path LIKE '/tmp/%' 
        -- Processes running from temporary directories 
    ) 
    AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Looking back 1 day for suspicious processes 
    -- REQUIRED upt_day filter for the historical table 
    AND sys.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- REQUIRED upt_day filter for system_info (historical table) 
    -- Each historical table must have its own upt_day filter 
ORDER BY  
    proc.start_time DESC 
    -- Most recent processes first 

Identifying Suspicious Outbound Network Connections 
This query detects potentially malicious outbound network connections from cloud instances by 
monitoring process-level network activity and identifying connections to known suspicious ports 
or IP ranges. 
 
Detection Coverage: 
 
-  Suspicious port usage (4444, 31337, etc.) 
-  Known malicious IP ranges 
-  Unusual process-to-network patterns 
-  Cross-region communications 
 
Key Components: 
 
1.  Network Connection Analysis 
 
-  Port monitoring 
-  IP range filtering 
-  Connection state tracking 
 
2.  Process Context 
 
Unset
-  Process identification 
-  Command line analysis 
-  Path verification 
 
3.  Cloud Asset Correlation 
 
-  Instance metadata 
-  Account context 
-  Regional information 
 
Performance Optimization: 
 
-  Uses _current tables for cloud assets 
-  Proper upt_day filtering 
-  Efficient join strategies 
SELECT  
    ec2.instance_id, ec2.private_ip_address, ec2.account_id, ec2.region, 
    proc.name AS process_name, proc.path AS process_path, proc.cmdline, 
    conn.local_address,conn.local_port,conn.remote_address, conn.remote_port, 
    conn.state 
FROM  
    aws_ec2_instance_current ec2 
    -- Using _current table for EC2 instances 
JOIN  
    process_open_sockets conn ON ec2.upt_asset_id = conn.upt_asset_id 
    -- Joining with open network connections 
JOIN  
    processes proc ON conn.pid = proc.pid AND conn.upt_asset_id = 
proc.upt_asset_id 
    -- Joining with process information 
WHERE  
    ( 
        conn.remote_port IN (4444, 31337, 8081, 8082, 8083, 8084, 8085) 
        -- Potentially suspicious ports 
        OR conn.remote_address LIKE '185.%' 
        -- Specific suspicious IP ranges 
    ) 
    -- Looking for connections to suspicious destinations 
    AND conn.state = 'ESTABLISHED' 
    -- Looking for established connections 
    AND NOT ( 
        -- Excluding known good addresses/services 
        conn.remote_address LIKE '23.21.227.%' OR -- AWS IPs 
        conn.remote_address LIKE '13.32.%' OR     -- CloudFront  
        conn.remote_address LIKE '13.33.%' OR     -- CloudFront 
        conn.remote_address LIKE '52.84.%' OR     -- CloudFront 
        conn.remote_address LIKE '52.85.%'        -- CloudFront 
    ) 
    AND proc.name NOT IN ('chrome', 'firefox', 'aws', 'gcloud', 'az') 
    -- Excluding common user applications 
    AND conn.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Current day filter for connections - REQUIRED 
    AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Current day filter for processes - REQUIRED 
ORDER BY  
    ec2.instance_id, 
    proc.start_time DESC 

Detecting Possible Command and Control (C2) Communications 
This query identifies potential command and control (C2) traffic by analyzing network 
connections from cloud instances for patterns commonly associated with malicious command 
and control channels. 
 
Detection Coverage: 
 
-  Suspicious network connections 
-  High-risk ports and protocols 
-  Known C2 patterns and behaviors 
-  Unusual process-to-network relationships 
-  Cross-cloud C2 detection 
 
Key Components: 
 
1.  Network Analysis 
 
-  Connection state monitoring 
-  Port categorization 
-  Protocol verification 
 
2.  Process Context 
 
-  Binary verification 
-  Command-line analysis 
Unset
-  Process lineage 
 
3.  Risk Scoring 
 
-  Port risk assessment 
-  Destination reputation 
-  Connection patterns 
 
Performance Optimization: 
 
-  Efficient network connection joins 
-  Process context correlation 
-  Proper day filtering 
-  Targeted exclusion lists 
 
SELECT 
    'AWS' AS cloud_provider, 
    ec2.instance_id AS resource_id, 
    ec2.private_ip_address, 
    ec2.public_ip_address, 
    ec2.region, 
    ec2.account_id, 
    proc.name AS process_name, 
    proc.path AS process_path, 
    proc.cmdline, 
    sock.remote_address, 
    sock.remote_port, 
    sock.state, 
    CASE 
        WHEN sock.remote_port IN (4444, 8080, 8083, 1080, 3389, 8090) THEN 
'High-Risk Port' 
        ELSE 'Standard Port' 
    END AS port_risk 
FROM 
    aws_ec2_instance_current ec2 
    -- Start with current EC2 instances 
JOIN 
    processes proc ON ec2.upt_asset_id = proc.upt_asset_id 
    -- Link instances to their running processes 
JOIN 
    process_open_sockets sock ON proc.pid = sock.pid AND proc.upt_asset_id = 
sock.upt_asset_id 
    -- Link processes to their network connections using both pid and asset ID 
for accuracy 
WHERE 
    sock.remote_port NOT IN (80, 443, 22, 53)  
    -- Exclude common legitimate service ports (HTTP, HTTPS, SSH, DNS) 
    AND sock.state = 'ESTABLISHED' 
    -- Only look at active connections 
    AND sock.remote_address NOT LIKE '10.%' 
    AND sock.remote_address NOT LIKE '172.16.%' 
    AND sock.remote_address NOT LIKE '192.168.%' 
    AND sock.remote_address NOT LIKE '127.%' 
    AND sock.remote_address NOT LIKE '169.254.%' 
    -- Exclude connections to private IP ranges and localhost 
    AND proc.name NOT IN ('chrome', 'firefox', 'aws', 'aws-cli', 'aws.exe') 
    -- Exclude known legitimate processes 
    AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Get current day's process data 
    AND sock.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Get current day's socket data 
 
UNION ALL 
 
SELECT 
    'GCP' AS cloud_provider, 
    gcp.name AS resource_id, 
    JSON_EXTRACT_SCALAR(gcp.network_interfaces, '$[0].networkIP') AS 
private_ip_address, 
    JSON_EXTRACT_SCALAR(gcp.network_interfaces, '$[0].accessConfigs[0].natIP') 
AS public_ip_address, 
    gcp.region, 
    gcp.project_id AS account_id, 
    proc.name AS process_name, 
    proc.path AS process_path, 
    proc.cmdline, 
    sock.remote_address, 
    sock.remote_port, 
    sock.state, 
    CASE 
        WHEN sock.remote_port IN (4444, 8080, 8083, 1080, 3389, 8090) THEN 
'High-Risk Port' 
        ELSE 'Standard Port' 
    END AS port_risk 
FROM 
    gcp_compute_instance_current gcp 
    -- Start with current GCP instances 
JOIN 
    processes proc ON gcp.upt_asset_id = proc.upt_asset_id 
    -- Link instances to their running processes 
JOIN 
    process_open_sockets sock ON proc.pid = sock.pid AND proc.upt_asset_id = 
sock.upt_asset_id 
    -- Link processes to their network connections 
WHERE 
    sock.remote_port NOT IN (80, 443, 22, 53) 
    -- Exclude common legitimate service ports 
    AND sock.state = 'ESTABLISHED' 
    -- Only look at active connections 
    AND sock.remote_address NOT LIKE '10.%' 
    AND sock.remote_address NOT LIKE '172.16.%' 
    AND sock.remote_address NOT LIKE '192.168.%' 
    AND sock.remote_address NOT LIKE '127.%' 
    AND sock.remote_address NOT LIKE '169.254.%' 
    -- Exclude connections to private IP ranges and localhost 
    AND proc.name NOT IN ('chrome', 'firefox', 'gcloud', 'gcloud.cmd') 
    -- Exclude known legitimate GCP-specific processes 
    AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Get current day's process data 
    AND sock.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Get current day's socket data 
 
UNION ALL 
 
SELECT 
    'Azure' AS cloud_provider, 
    az.name AS resource_id, 
    JSON_EXTRACT_SCALAR(az.network_profile, 
'$.networkInterfaces[0].privateIPAddress') AS private_ip_address, 
    JSON_EXTRACT_SCALAR(az.network_profile, 
'$.networkInterfaces[0].publicIPAddress') AS public_ip_address, 
    az.region AS region, 
    az.subscription_id AS account_id, 
    proc.name AS process_name, 
    proc.path AS process_path, 
    proc.cmdline, 
    sock.remote_address, 
    sock.remote_port, 
    sock.state, 
    CASE 
        WHEN sock.remote_port IN (4444, 8080, 8083, 1080, 3389, 8090) THEN 
'High-Risk Port' 
        ELSE 'Standard Port' 
    END AS port_risk 
FROM 
    azure_compute_vm_current az 
    -- Start with current Azure VMs 
JOIN 
    processes proc ON az.upt_asset_id = proc.upt_asset_id 
    -- Link VMs to their running processes 
JOIN 
    process_open_sockets sock ON proc.pid = sock.pid AND proc.upt_asset_id = 
sock.upt_asset_id 
    -- Link processes to their network connections 
WHERE 
    sock.remote_port NOT IN (80, 443, 22, 53) 
    -- Exclude common legitimate service ports 
    AND sock.state = 'ESTABLISHED' 
    -- Only look at active connections 
    AND sock.remote_address NOT LIKE '10.%' 
    AND sock.remote_address NOT LIKE '172.16.%' 
    AND sock.remote_address NOT LIKE '192.168.%' 
    AND sock.remote_address NOT LIKE '127.%' 
    AND sock.remote_address NOT LIKE '169.254.%' 
    -- Exclude connections to private IP ranges and localhost 
    AND proc.name NOT IN ('chrome', 'firefox', 'az', 'azure-cli') 
    -- Exclude known legitimate Azure-specific processes 
    AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Get current day's process data 
    AND sock.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- Get current day's socket data 
ORDER BY 
    cloud_provider, 
    resource_id 

Suspicious File Execution and Privilege Escalation Detection 
This query detects potential privilege escalation attempts and suspicious process execution 
patterns across cloud environments. 
 
Detection Coverage: 
 
-  Suspicious file locations (/tmp, /dev/shm) 
-  Privilege escalation attempts 
-  Suspicious command patterns 
-  Fileless malware execution 
-  Shell code execution 
 
Key Components: 
 
1.  Process Event Monitoring 
2.  Cloud Asset Correlation 
3.  Command Pattern Analysis 
4.  Risk Level Assessment 
 
Performance Considerations: 
 
-  Uses _current tables for cloud assets 
-  Proper upt_day filtering on historical tables 
-  Efficient LEFT JOINs for cloud correlation 
 
SELECT 
    -- Resource and event identification 
    CASE 
        WHEN ec2.instance_id IS NOT NULL THEN CONCAT('AWS EC2: ', 
ec2.instance_id) 
        WHEN gcp.name IS NOT NULL THEN CONCAT('GCP: ', gcp.name) 
        WHEN az.name IS NOT NULL THEN CONCAT('Azure: ', az.name) 
        ELSE CONCAT('Host: ', pe.upt_hostname) 
    END AS resource_id, pe.upt_hostname AS hostname, 
    -- Timestamp information 
    pe.time AS event_time, date_format(from_unixtime(pe.time), '%Y-%m-%d 
%H:%i:%s') AS formatted_time, 
    -- Process details 
    pe.path AS process_path, pe.cmdline AS command_line, 
    pe.pid, pe.parent AS parent_pid, 
    -- User context 
    pe.uid AS user_id, pe.euid AS effective_user_id, 
    CASE WHEN pe.uid != pe.euid THEN 'Yes' ELSE 'No' END AS 
privilege_escalation, 
    -- Process context 
    pe.cwd AS current_directory, pe.ancestor_list, 
    CASE 
        WHEN pe.cmdline LIKE '%chmod%777%' OR  
             pe.cmdline LIKE '%chmod%+x%' OR  
             pe.cmdline LIKE '%chown%' OR  
             pe.cmdline LIKE '%sudo%' OR  
             pe.cmdline LIKE '%su %root%' OR 
             pe.cmdline LIKE '%/tmp/%' OR 
             pe.cmdline LIKE '%/dev/shm/%' OR 
             pe.cmdline LIKE '%nc %' OR 
             pe.cmdline LIKE '%netcat%' OR 
             pe.cmdline LIKE '%ncat%' OR 
             pe.cmdline LIKE '%curl%|%sh%' OR 
             pe.cmdline LIKE '%wget%|%sh%' OR 
             pe.cmdline LIKE '%base64%-d%' 
        THEN 'High' 
        WHEN pe.cmdline LIKE '%/var/tmp/%' OR 
             pe.cmdline LIKE '%eval%' OR 
             pe.cmdline LIKE '%`%`%' 
        THEN 'Medium' 
        ELSE 'Low' 
    END AS risk_level 
FROM 
    process_events pe 
    -- Base table: process_events contains all process creation events 
(historical table) 
    -- Using this instead of 'processes' to catch short-lived processes that 
might be missed 
LEFT JOIN 
    aws_ec2_instance_current ec2 ON pe.upt_asset_id = ec2.upt_asset_id 
    -- LEFT JOIN with _current table preserves all process events even if 
instance is terminated 
    -- No upt_day filter needed for _current table 
LEFT JOIN 
    gcp_compute_instance_current gcp ON pe.upt_asset_id = gcp.upt_asset_id 
    -- Similar LEFT JOIN pattern for GCP instances 
    -- Using _current tables ensures we get the latest instance metadata 
LEFT JOIN 
    azure_compute_vm_current az ON pe.upt_asset_id = az.upt_asset_id 
    -- Completing the cloud provider coverage with Azure VMs 
    -- LEFT JOINs allow detection of events even on non-cloud hosts 
WHERE 
    ( 
        -- Suspicious locations: Common paths used for malware and temporary 
files 
        pe.path LIKE '/tmp/%' OR 
        pe.path LIKE '/dev/shm/%' OR 
        pe.path LIKE '/var/tmp/%' OR 
         
        -- Privilege escalation: Detecting effective UID changes 
        -- Different UIDs indicate potential privilege escalation 
        pe.uid != pe.euid OR 
         
        -- Suspicious command patterns: Common system modification commands 
        pe.cmdline LIKE '%chmod%777%' OR  -- Full permission changes 
        pe.cmdline LIKE '%chmod%+x%' OR   -- Making files executable 
        pe.cmdline LIKE '%chown%' OR      -- Ownership changes 
        pe.cmdline LIKE '%sudo%' OR       -- Privilege escalation 
        pe.cmdline LIKE '%su %root%' OR   -- Direct root access attempts 
         
        -- Network tools: Often used in post-exploitation 
        pe.cmdline LIKE '%nc %' OR        -- Netcat variations 
        pe.cmdline LIKE '%netcat%' OR     -- Full netcat name 
        pe.cmdline LIKE '%ncat%' OR       -- Nmap's netcat 
         
        -- Fileless execution: Common fileless malware patterns 
        pe.cmdline LIKE '%curl%|%sh%' OR  -- Direct pipe to shell 
        pe.cmdline LIKE '%wget%|%sh%' OR  -- Another pipe to shell variant 
        pe.cmdline LIKE '%base64%-d%' OR  -- Encoded payload execution 
         
        -- Shell code: Dynamic code execution patterns 
        pe.cmdline LIKE '%eval%' OR       -- Direct command evaluation 
        pe.cmdline LIKE '%`%`%'           -- Backtick command substitution 
    ) 
    AND pe.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT) 
    -- CRITICAL: upt_day filter on process_events (historical table) 
    -- Using current day only to focus on recent suspicious activity 
    -- Without this filter, query would scan ALL historical data 
ORDER BY 
    pe.time DESC, 
    risk_level 

Multi-Cloud Lateral Movement Detection 
This query provides comprehensive detection of lateral movement attempts across multi-cloud 
environments by monitoring process-to-process communications and network connections 
between assets. 
 
Detection Coverage: 
 
-  Cross-cloud movement (AWS ↔ GCP ↔ Azure) 
-  Cross-account privilege escalation 
-  Cross-region traversal 
-  Process-based detection (SSH, RDP, PowerShell) 
-  Network-based detection (ports, protocols) 
 
Key Components: 
 
1.  Asset Registry 
 
-  Normalized view of cloud assets 
-  Unified IP and metadata handling 
-  Cross-cloud resource mapping 
 
2.  Process Monitoring 
 
-  Track suspicious processes 
-  Monitor privileged operations 
-  Analyze command patterns 
 
3.  Network Analysis 
 
-  Monitor established connections 
-  Track cross-boundary movement 
-  Identify unusual patterns 
 
4.  Risk Assessment 
 
-  Movement pattern analysis 
-  Privilege level evaluation 
-  Context-aware scoring 
 
Performance Optimization: 
 
-  Uses _current tables for cloud assets 
-  Proper upt_day filters on historical tables 
-  Efficient join strategies 
-  Indexed column usage 
-  Optimized filtering order 
 
WITH asset_registry AS ( 
    -- Build a unified view of all cloud assets across providers 
    -- This CTE normalizes instance metadata for correlation across AWS, GCP, 
and Azure 
     
    -- AWS Assets Section 
    SELECT 
        'AWS' AS cloud_provider, instance_id AS resource_id,                    
-- Provider and ID 
        private_ip_address AS internal_ip, public_ip_address AS external_ip,    
-- Network addresses 
        region, account_id, upt_asset_id,                                      
-- Asset context 
        instance_type AS resource_type, state_name AS state,                   
-- Instance details 
        JSON_EXTRACT_SCALAR(tags, '$.Name') AS name, 'EC2' AS service_type     
-- Metadata 
    FROM 
        aws_ec2_instance_current 
    WHERE 
        state_name = 'running' 
     
    UNION ALL 
     
    -- GCP Assets Section 
    SELECT 
        'GCP' AS cloud_provider, name AS resource_id,                          
-- Provider and ID 
        JSON_EXTRACT_SCALAR(network_interfaces, '$[0].networkIP') AS 
internal_ip, 
        JSON_EXTRACT_SCALAR(network_interfaces, '$[0].accessConfigs[0].natIP') 
AS external_ip,  -- Network 
        region, project_id AS account_id, upt_asset_id,                        
-- Asset context 
        machine_type AS resource_type, status AS state,                        
-- Instance details 
        JSON_EXTRACT_SCALAR(labels, '$.name') AS name, 'Compute' AS 
service_type  -- Metadata 
    FROM 
        gcp_compute_instance_current 
    WHERE 
        status = 'RUNNING' 
     
    UNION ALL 
     
    -- Azure Assets Section 
    SELECT 
        'Azure' AS cloud_provider, name AS resource_id,                         
-- Provider and ID 
        JSON_EXTRACT_SCALAR(network_profile, 
'$.networkInterfaces[0].privateIPAddress') AS internal_ip, 
        JSON_EXTRACT_SCALAR(network_profile, 
'$.networkInterfaces[0].publicIPAddress') AS external_ip,  -- Network 
        region, subscription_id AS account_id, upt_asset_id,        -- Asset 
context 
        JSON_EXTRACT_SCALAR(hardware_profile, '$.vmSize') AS resource_type, 
        provisioning_state AS state, name AS name, 'VM' AS service_type        
-- Instance details 
    FROM 
        azure_compute_vm_current 
    WHERE 
        provisioning_state = 'Succeeded' 
 
SELECT 
    -- Source Asset Information 
    src.cloud_provider AS source_cloud, src.resource_id AS source_id,      -- 
Cloud platform and resource 
    src.internal_ip AS source_ip, src.region AS source_region,            -- 
Network and location 
    src.account_id AS source_account, src.service_type AS source_service,  -- 
Account and service context 
    -- Destination Asset Information 
    dst.cloud_provider AS destination_cloud, dst.resource_id AS destination_id,    
-- Cloud and resource 
    dst.internal_ip AS destination_ip, dst.region AS destination_region,          
-- Network and location 
    dst.account_id AS destination_account, dst.service_type AS 
destination_service, -- Account and service 
    -- Movement Classification 
    -- Categorize the type of lateral movement attempt 
    -- Higher risk movements are listed first in the CASE statement 
    CASE 
        WHEN src.cloud_provider != dst.cloud_provider THEN 'Cross-Cloud'   -- 
Highest risk: moving between cloud providers 
        WHEN src.account_id != dst.account_id THEN 'Cross-Account'        -- 
High risk: moving between accounts 
        WHEN src.region != dst.region THEN 'Cross-Region'                 -- 
Medium risk: geographic movement 
        ELSE 'Same Environment'                                           -- 
Lower risk: local movement 
    END AS movement_type, 
    -- Process Information 
    p.name AS process_name, p.path AS process_path,         -- Process 
identification 
    p.cmdline AS command_line, p.uid AS user_id,           -- Execution context 
    pos.local_port, pos.remote_port, pos.state,            -- Connection 
details 
    -- Risk Level Assessment 
    -- Evaluate process names for known lateral movement tools and scripting 
languages 
    CASE 
        WHEN p.name IN ('ssh', 'scp', 'rdp', 'mstsc.exe', 'psexec.exe', 
'winrm', 'winrs', 'nc', 'netcat')  
            THEN 'High'      -- Direct remote access and known lateral movement 
tools 
        WHEN p.name IN ('powershell.exe', 'powershell', 'python', 'python3', 
'perl', 'ruby', 'bash', 'sh', 'cmd.exe')  
            THEN 'Medium'    -- Scripting languages that could be used 
maliciously 
        ELSE 'Low'          -- Other processes 
    END AS risk_level, 
    CASE 
        WHEN src.account_id != dst.account_id OR src.cloud_provider != 
dst.cloud_provider THEN 'Alert' 
        ELSE 'Info' 
    END AS alert_level, 
    p.start_time 
FROM 
    asset_registry src                           -- Start with source assets 
from our normalized view 
JOIN 
    processes p                                  -- Get process information 
        ON src.upt_asset_id = p.upt_asset_id    -- Link process to source asset 
JOIN 
    process_open_sockets pos                     -- Get network connection 
information 
        ON p.pid = pos.pid                       -- Link socket to specific 
process 
Unset
        AND p.upt_asset_id = pos.upt_asset_id   -- Ensure socket belongs to 
same asset 
JOIN 
    asset_registry dst                           -- Join to destination asset 
        ON pos.remote_address = dst.internal_ip  -- Match on private IP 
        OR pos.remote_address = dst.external_ip  -- Also check public IP for 
external connections 
WHERE 
    src.upt_asset_id != dst.upt_asset_id        -- Exclude self-connections 
    AND pos.state = 'ESTABLISHED'                -- Only look at active 
connections 
    -- Time-based filters for historical tables 
    AND p.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT)      -- Last 24h of process data 
    AND pos.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), 
'%Y%m%d') AS INT)    -- Last 24h of socket data 
ORDER BY 
    movement_type, 
    risk_level, 
    p.start_time DESC 

Performance Optimization Tips 
When working with Uptycs schema, follow these best practices to optimize query performance: 

1. Historical Table Management 
ALWAYS filter historical tables by upt_day 
 
-  This is not optional - queries will fail or retrieve excessive data without this filter 
-  The upt_day filter is a required field for all historical tables 
-  Missing this filter is the most common cause of query performance problems 
 
Example: 
 
WHERE 
    upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), '%Y%m%d') AS 
INT) 

2. Table Selection Strategy 
Use _current tables when possible 
 
-  Pre-filtered and provide better performance 
-  Only use historical tables when historical analysis is required 
-  Not all tables follow the "*_current" naming pattern 
 
Example: 
 
FROM 
    aws_ec2_instance_current 
    -- Better than using aws_ec2_instance for current state 

3. Join Optimization 
Filter early in joins 
 
-  Apply filters to tables before joining 
-  Use appropriate join types (LEFT vs INNER) 
-  Consider join order for large datasets 
 
Example: 
 
JOIN 
    processes p ON asset.upt_asset_id = p.upt_asset_id 
    -- Join on indexed columns 
WHERE 
    p.name LIKE '%suspicious%' 
    -- Apply process filters after join 

4. Data Filtering 
Use specific filters on high-cardinality columns 
 
-  Filter on instance_id, account_id, or region before joining 
-  Use equality operators when possible 
-  Leverage indexed columns in WHERE clauses 
 
Example: 
 
WHERE 
    account_id = '123456789012' 
    AND region = 'us-east-1' 

5. JSON Handling 
Optimize JSON operations 
 
-  Extract only needed JSON fields 
-  Apply filters to JSON fields in WHERE clauses 
-  Use JSON_EXTRACT_SCALAR for better performance 
 
Example: 
 
SELECT 
    JSON_EXTRACT_SCALAR(tags, '$.Name') AS instance_name 
    -- Better than extracting entire tags object 

Verified Table Information 
Based on the schema files, here are the correct table names and key columns for cloud 
resources: 

AWS EC2 Instances 
-  Table name: aws_ec2_instance_current 
-  Key columns: 
●  instance_id: The ID of the instance (VARCHAR) 
●  instance_type: The instance type (VARCHAR) 
●  private_ip_address: The private IPv4 address assigned to the instance 
(VARCHAR) 
●  public_ip_address: The public IPv4 address (VARCHAR) 
●  state_name: The current state of the instance (VARCHAR) 
 
    • Related Articles

    • OpenVPN query

      The following query should identify all assets with OpenVPN installed and last time it was used SELECT p.upt_hostname AS asset_name, MAX(p.upt_time) AS openvpn_last_accessed FROM socket_events se JOIN processes p on se.pid = p.pid WHERE se.pid > 0 ...
    • Query to find the number of Osquery restarts in a day for all hosts

      Following query, run on Global database, would provide details on number of osquery restarts in a day (GMT) Query SELECT upt_hostname, Count(0)  FROM processes  WHERE name LIKE 'osqueryd%' AND upt_added  AND upt_day = Cast(Date_format(current_date - ...
    • Windows Disk Usage query

      The article includes query to fetch the Windows Disk usage: Query SELCET DISTINCT                 upt_hostname,                 upt_time,                 device_id,                 size,                 free_space,                 size - free_space ...
    • Query to find Processes Running

      Processes running between certain times Query SELECT * FROM windowed_processes p WHERE p.upt_add_time >= <TIMESTAMP> AND p.upt_add_time <= <TIMESTAMP>; Example All processes running between 2018-07-18 2:00 to 2018-08-18 3:00 SELECT * FROM ...
    • High CPU Utilization Alert Query

      Query WITH cpu_incremental AS ( SELECT upt_asset_id, upt_hostname, upt_time, cpu_idle - LAG(cpu_idle) OVER (PARTITION BY upt_asset_id ORDER BY upt_time) AS cpu_idle, cpu_total - LAG(cpu_total) OVER (PARTITION BY upt_asset_id ORDER BY upt_time) AS ...