Global Query Tutorial

Understanding Uptycs Schema

Introduction

Welcome to this step-by-step tutorial on understanding and effectively using the Uptycs

Schema. This guide will help you navigate the complex data structure of Uptycs, create effective

queries, and unlock valuable insights from your cloud and on-premises environments.

Uptycs provides a unified view of your cloud and on-premises assets, allowing you to monitor,

analyze, and secure your infrastructure. To make the most of this platform, it's essential to

understand how data is organized and how to construct queries that deliver accurate and

meaningful results.

Schema Overview

Uptycs organizes data across multiple categories, each representing different cloud providers

and technologies:

- AWS resources: Cloud resources from Amazon Web Services

- Azure resources: Cloud resources from Microsoft Azure

- GCP resources: Cloud resources from Google Cloud Platform

- Kubernetes resources: Container orchestration resources

- Cross-platform resources: Tables that span multiple platforms (processes, packages,

etc.)

Table Types and Their Uses

The Uptycs schema contains different types of tables that serve various purposes.

Understanding these table types is crucial for constructing effective queries.

_current Tables vs. Historical Tables

Uptycs uses two primary table types: _current tables and historical tables (tables without the

_current suffix).

_current Tables

- Represent the most recent state snapshot

- Deduplicated, showing only the current state

- Default choice for most queries

- Used when you don't need historical data

- Example: aws_ec2_instance_current

Historical Tables

- Contain historical/time-series data

- Include a upt_day column (date format stored as integer, e.g., 20250301)

- May have multiple entries per object

- Used for tracking changes over time

- Required when you specify "historic" in your requirements

- Example: aws_ec2_instance (without "_current")

- Not all tables follow the "*_current" naming pattern (e.g., upt_assets table)

Example: Comparing _current vs. Historical Tables

Current state only (using _current table):

SELECT

instance_id,

instance_type,

private_ip_address,

public_ip_address,

state_name

FROM

aws_ec2_instance_current

WHERE

account_id = '123456789012'

-- No upt_day filter needed since _current tables only contain the latest

state

-- _current tables are already filtered and deduplicated

Historical view (using historical table):

SELECT

instance_id,

instance_type,

private_ip_address,

public_ip_address,

state_name,

upt_day

FROM

aws_ec2_instance

WHERE

Unset

account_id = '123456789012'

-- upt_day filter is REQUIRED for historical tables - NOT OPTIONAL

-- Queries on historical tables will fail or return excessive data without

this filter

AND upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

ORDER BY

instance_id,

upt_day

Understanding the upt_day Column

The upt_day column is a critical component of historical tables that enables historical analysis:

- Format: Integer representation of a date (YYYYMMDD)

- Example: March 1, 2025 would be represented as 20250301

- Used to filter data for specific time ranges

- REQUIRED for all queries on historical tables - queries will fail or produce excessive

data without this filter

- Critical for both correctness and performance

Best Practices for upt_day Filtering

Always include a upt_day filter when querying historical tables - this is not optional:

-- Last 7 days of data

WHERE upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), '%Y%m%d')

AS INT)

-- Current day only

WHERE upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), '%Y%m%d')

AS INT)

-- Specific date range

WHERE upt_day BETWEEN CAST('20250201' AS INT) AND CAST('20250228' AS INT)

Linking Assets with upt_asset_id

The upt_asset_id field is a crucial identifier that links compute assets with runtime

information:

- Uniquely identifies cloud resources across the Uptycs schema

- Enables joining compute resources with installed software, processes, and security

findings

- Only available for compute resources (EC2, Lambda, VMs, etc.)

- Not available for non-compute resources like VPCs, security groups, etc.

Working with Mixed Table Types

Sometimes you need to join _current tables with historical tables. Here's how to handle these

situations correctly:

Example: Joining _current with Historical Tables

SELECT

ec2.instance_id,

ec2.private_ip_address,

proc.name AS process_name,

proc.pid,

proc.cmdline

FROM

aws_ec2_instance_current ec2

-- Using _current table for EC2 instances (no upt_day needed)

JOIN

processes proc ON ec2.upt_asset_id = proc.upt_asset_id

-- Joining tables using upt_asset_id links instances with their processes

WHERE

proc.name LIKE '%nginx%'

-- Filtering for processes with "nginx" in the name

AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- IMPORTANT: Apply upt_day filter only to historical tables (processes)

-- Without this filter, the query would process excessive historical data

Key points:

- Apply the upt_day filter only to the historical table (processes)

- The _current table (aws_ec2_instance_current) doesn't need a upt_day filter

Unset

- This approach joins current EC2 state with current process information

- Always include a comment explaining why the upt_day filter is used

Working with JSON Fields

Many cloud resources store complex data in JSON format, particularly in GCP and Azure.

Understanding how to extract and filter JSON data is essential for effective cloud resource

analysis.

Example: Extracting Tags from AWS Resources

SELECT

instance_id,

JSON_EXTRACT_SCALAR(tags, '$.Name') AS name_tag,

JSON_EXTRACT_SCALAR(tags, '$.Environment') AS environment_tag,

JSON_EXTRACT_SCALAR(tags, '$.Owner') AS owner_tag

FROM

aws_ec2_instance_current

WHERE

JSON_EXTRACT_SCALAR(tags, '$.Environment') = 'Production'

Example: Working with GCP Network Interfaces

SELECT

name AS instance_name,

JSON_EXTRACT_SCALAR(network_interfaces, '$[0].networkIP') AS primary_ip,

JSON_EXTRACT_SCALAR(network_interfaces, '$[0].subnetwork') AS subnet,

JSON_EXTRACT_SCALAR(network_interfaces, '$[0].accessConfigs[0].natIP') AS

public_ip

FROM

gcp_compute_instance_current

WHERE

JSON_EXTRACT_SCALAR(network_interfaces, '$[0].accessConfigs[0].natIP') IS

NOT NULL

Cross-Cloud Queries and Data Consistency

When querying across multiple cloud providers, you need to handle differences in schema

structure and data types:

Example: Finding Instances Across Multiple Clouds

-- AWS EC2 Instances

SELECT

'AWS' AS cloud_provider,

instance_id AS resource_id,

instance_type AS resource_type,

private_ip_address AS private_ip,

public_ip_address AS public_ip,

state_name AS state

FROM

aws_ec2_instance_current

-- Using _current tables to get latest state

UNION ALL

-- GCP VM Instances

SELECT

'GCP' AS cloud_provider,

name AS resource_id,

machine_type AS resource_type,

JSON_EXTRACT_SCALAR(network_interfaces, '$[0].networkIP') AS private_ip,

JSON_EXTRACT_SCALAR(network_interfaces, '$[0].accessConfigs[0].natIP') AS

public_ip,

status AS state

FROM

gcp_compute_instance_current

-- Using _current tables to get latest state

UNION ALL

-- Azure VM Instances

SELECT

'Azure' AS cloud_provider,

name AS resource_id,

JSON_EXTRACT_SCALAR(hardware_profile, '$.vmSize') AS resource_type,

JSON_EXTRACT_SCALAR(network_profile,

'$.networkInterfaces[0].privateIPAddress') AS private_ip,

JSON_EXTRACT_SCALAR(network_profile,

'$.networkInterfaces[0].publicIPAddress') AS public_ip,

provisioning_state AS state

FROM

azure_compute_vm_current

-- Using _current tables to get latest state

Key points:

- Notice how we handle JSON fields in GCP and Azure using JSON extraction functions

- Column names differ across providers (instance_id vs. name)

- Data structure differs (scalar values in AWS vs. JSON arrays in GCP/Azure)

- Using common column aliases creates a consistent output format

Advanced Security and Compliance Queries

Let's explore advanced scenarios that combine different table types and demonstrate powerful

security and compliance use cases:

MySQL Installations Across Package Types and Platforms

This query provides a comprehensive view of MySQL installations across different cloud

providers (AWS, GCP, Azure) and package management systems (DEB, RPM), enabling

effective software inventory and compliance monitoring.

Detection Coverage:

- MySQL package installations

- Package versions and sources

- Cross-cloud deployment tracking

- Operating system variations

- Package management systems

Key Components:

1. Package Information

- Package names and versions

- Installation sources

- Package management type

2. Cloud Resource Context

- Instance identification

- Network information

- Cloud provider details

3. System Details

- Operating system versions

- Hostname information

- Asset correlation

Performance Optimization:

- Efficient UNION ALL operations

- Proper historical table filtering

- Optimized package name matching

- Strategic table joins

-- Debian Packages on AWS EC2

SELECT

'AWS' AS cloud_provider,

'DEB' AS package_type,

ec2.instance_id AS resource_id,

ec2.private_ip_address,

sys.hostname,

sys.upt_os_version AS os_version,

deb.name AS package_name,

deb.version AS package_version,

deb.source AS package_source

FROM

aws_ec2_instance_current ec2

JOIN

system_info sys ON ec2.upt_asset_id = sys.upt_asset_id

-- Using system_info for host information

JOIN

deb_packages deb ON ec2.upt_asset_id = deb.upt_asset_id

WHERE

deb.name LIKE '%mysql%'

-- Matching on MySQL packages (mysql-server, mysql-client, mysql-common,

etc.)

AND deb.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- REQUIRED upt_day filter for the historical deb_packages table

AND sys.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- REQUIRED upt_day filter for the historical system_info table

UNION ALL

-- RPM Packages on AWS EC2

SELECT

'AWS' AS cloud_provider,'RPM' AS package_type,

ec2.instance_id AS resource_id,

ec2.private_ip_address,

sys.hostname,

sys.upt_os_version AS os_version,

rpm.name AS package_name,

rpm.version AS package_version,

rpm.vendor AS package_source

FROM

aws_ec2_instance_current ec2

JOIN

system_info sys ON ec2.upt_asset_id = sys.upt_asset_id

-- Using system_info for host information

JOIN

rpm_packages rpm ON ec2.upt_asset_id = rpm.upt_asset_id

WHERE

rpm.name LIKE '%mysql%'

-- Matching on MySQL packages (mysql-server, mysql-client, mysql-libs,

etc.)

AND rpm.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- REQUIRED upt_day filter for the historical rpm_packages table

AND sys.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- REQUIRED upt_day filter for the historical system_info table

UNION ALL

-- Packages on GCP instances

SELECT

'GCP' AS cloud_provider,

pkg.package_type,

gcp.name AS resource_id,

JSON_EXTRACT_SCALAR(gcp.network_interfaces, '$[0].networkIP') AS

private_ip_address,

sys.hostname,

sys.upt_os_version AS os_version,

pkg.name AS package_name,

pkg.version AS package_version,

pkg.vendor AS package_source

FROM

gcp_compute_instance_current gcp

JOIN

system_info sys ON gcp.upt_asset_id = sys.upt_asset_id

-- Using system_info for host information

JOIN

-- Using a subquery to unify package sources

(

SELECT upt_asset_id, 'DEB' AS package_type, name, version, source AS

vendor, upt_day

FROM deb_packages

WHERE name LIKE '%mysql%'

UNION ALL

SELECT upt_asset_id, 'RPM' AS package_type, name, version, vendor,

upt_day

FROM rpm_packages

WHERE name LIKE '%mysql%'

) pkg ON gcp.upt_asset_id = pkg.upt_asset_id

WHERE

pkg.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- REQUIRED upt_day filter for the historical package data

AND sys.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- REQUIRED upt_day filter for the historical system_info table

UNION ALL

-- Packages on Azure VMs

SELECT

'Azure' AS cloud_provider,

pkg.package_type,

az.name AS resource_id,

JSON_EXTRACT_SCALAR(az.network_profile,

'$.networkInterfaces[0].privateIPAddress') AS private_ip_address,

sys.hostname,

sys.upt_os_version AS os_version,

pkg.name AS package_name,

pkg.version AS package_version,

pkg.vendor AS package_source

FROM

azure_compute_vm_current az

JOIN

system_info sys ON az.upt_asset_id = sys.upt_asset_id

-- Using system_info for host information

JOIN

-- Using a subquery to unify package sources

(

SELECT upt_asset_id, 'DEB' AS package_type, name, version, source AS

vendor, upt_day

FROM deb_packages

WHERE name LIKE '%mysql%'

UNION ALL

SELECT upt_asset_id, 'RPM' AS package_type, name, version, vendor,

upt_day

FROM rpm_packages

WHERE name LIKE '%mysql%'

) pkg ON az.upt_asset_id = pkg.upt_asset_id

WHERE

pkg.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- REQUIRED upt_day filter for the historical package data

AND sys.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- REQUIRED upt_day filter for the historical system_info table

ORDER BY

cloud_provider,

resource_id,

package_name

Finding EC2 Instances with Vulnerable MySQL Packages

This query identifies EC2 instances running potentially vulnerable MySQL packages by

correlating instance data with vulnerability information, helping security teams prioritize patching

and remediation efforts.

Detection Coverage:

- MySQL package vulnerabilities

- Critical and high severity CVEs

- Package version tracking

- Cross-instance correlation

Key Components:

1. Vulnerability Analysis

- CVE identification

- Severity assessment

- Package version tracking

2. Package Context

- Package name verification

- Version information

- Installation details

3. Instance Correlation

- Resource identification

- Network context

- Asset tracking

Performance Optimization:

- Uses _current tables for EC2 instances

- Efficient vulnerability joins

- Proper historical table filtering

- Optimized severity filtering

SELECT

ec2.instance_id,ec2.private_ip_address,

vuln.cve_name, vuln.package_name, vuln.package_version,

vuln.advisory_severity, vuln.package_type

FROM

aws_ec2_instance_current ec2

-- Using _current table for EC2 to get current state of instances

JOIN

vulnerabilities vuln ON ec2.upt_asset_id = vuln.upt_asset_id

-- Joining on upt_asset_id links instances with their vulnerabilities

WHERE

vuln.package_name LIKE '%mysql%'

-- Focusing on MySQL-related vulnerabilities

AND vuln.advisory_severity IN ('critical', 'high')

-- Filtering for only high-severity vulnerabilities

AND vuln.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- IMPORTANT: Apply upt_day filter to vulnerabilities (historical table)

-- This filter is REQUIRED, not optional

ORDER BY

ec2.instance_id,

vuln.advisory_severity

-- Ordering by instance and severity for better readability

Finding Cloud Resources with Network Connections

This query provides comprehensive visibility into network connections across cloud resources,

helping identify active network communications and potential security concerns.

Detection Coverage:

- Active network connections

- Process-to-network mapping

- Cross-instance communication

- Protocol and port usage

Key Components:

1. Network Connection Analysis

- Socket state tracking

- Port monitoring

- Address resolution

2. Process Context

- Process identification

- Resource utilization

- Network activity correlation

3. Cloud Resource Correlation

- Instance metadata

- Network interface details

- Cross-cloud visibility

Performance Optimization:

- Uses _current tables for cloud assets

- Efficient process-socket joins

- Proper historical table filtering

- Optimized multi-table correlation

SELECT

ec2.instance_id,

ec2.private_ip_address,

proc.name AS process_name,

sockets.local_address,

sockets.local_port,

sockets.remote_address,

sockets.remote_port,

sockets.state

FROM

aws_ec2_instance_current ec2

-- Using _current table for EC2 instances (current state only)

JOIN

processes proc ON ec2.upt_asset_id = proc.upt_asset_id

-- Joining with processes using upt_asset_id

JOIN

process_open_sockets sockets ON proc.pid = sockets.pid AND

proc.upt_asset_id = sockets.upt_asset_id

-- Joining process_open_sockets with processes using both pid and

upt_asset_id

-- Double join condition ensures correct process-to-socket mapping

WHERE

sockets.remote_port = 443

-- Filtering for HTTPS connections (port 443)

AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Using current day filter for processes (historical table)

AND sockets.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Using current day filter for process_open_sockets (historical table)

-- IMPORTANT: Each historical table needs its own upt_day filter

-- _current tables (ec2) don't need upt_day filters

Finding Instances with Exposed Sensitive Ports

This query identifies cloud instances that have sensitive ports exposed to the internet through a

combination of listening services and permissive security group rules, helping teams identify

potential security vulnerabilities.

Detection Coverage:

- Sensitive port exposure (SSH, RDP, databases)

- Public IP accessibility

- Security group misconfigurations

- Process-port relationships

Key Components:

1. Port Analysis

- Service identification

- Protocol verification

- Binding interface checks

2. Security Group Evaluation

- Rule analysis

- CIDR range verification

- Protocol matching

3. Process Context

- Service identification

- Path verification

- Resource correlation

4. Network Exposure

- Public IP detection

- Interface binding analysis

- Port range validation

Performance Optimization:

- Uses _current tables for cloud resources

- Efficient security group joins

- Proper historical table filtering

- Optimized port range checks

SELECT

ec2.instance_id,

ec2.private_ip_address,

ec2.public_ip_address,

ec2.account_id,

ec2.region,

listen.port,

listen.protocol,

proc.name AS process_name,

proc.path AS process_path,

sg.security_group_name AS security_group

FROM

aws_ec2_instance_current ec2

-- Using _current table for EC2 instances

JOIN

listening_ports listen ON ec2.upt_asset_id = listen.upt_asset_id

-- Joining with listening ports information

JOIN

processes proc ON listen.pid = proc.pid

AND listen.upt_asset_id = proc.upt_asset_id

-- Joining with process information

JOIN

aws_ec2_security_group_current sg ON ec2.vpc_id = sg.vpc_id

-- Joining with security group information

JOIN

aws_ec2_security_group_rule_current rule ON sg.security_group_id =

rule.group_id

-- Joining with security group rules

WHERE

listen.port IN (22, 3389, 1433, 3306, 5432, 27017, 6379, 9200, 8080, 8443)

-- List of potentially sensitive ports

AND (

rule.cidr_ipv4 = '0.0.0.0/0' OR

rule.cidr_ipv6 = '::/0'

)

-- Looking for rules that allow access from anywhere

AND (

(rule.from_port <= listen.port AND rule.to_port >= listen.port) OR

rule.from_port = -1 -- All ports

)

-- Ensuring the port is within the allowed range

AND listen.address = '0.0.0.0'

-- Looking for services binding to all interfaces

AND ec2.public_ip_address IS NOT NULL

-- Only including instances with public IPs

AND listen.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Current day filter for listening_ports - REQUIRED

AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Current day filter for processes - REQUIRED

ORDER BY

ec2.instance_id,

listen.port

Detecting Unusual Process Activity Across Cloud Resources

This query monitors and identifies potentially malicious or unusual process activity across cloud

environments by detecting suspicious process patterns, command execution, and high-risk

behaviors.

Detection Coverage:

- Network utility processes

- Reverse shell attempts

- Base64 encoded execution

- Suspicious file locations

- Cross-cloud process activity

Key Components:

1. Process Analysis

- Name and path verification

- Command line inspection

- Execution context tracking

2. Cloud Resource Context

- Provider identification

- Instance correlation

- System information

3. Security Pattern Detection

- Network tool usage

- Encoded command execution

Unset

- Temporary file execution

- Privilege context

Performance Optimization:

- Efficient cloud resource joins

- Proper historical table filtering

- Multi-provider correlation

- Optimized process tracking

SELECT

sys.hostname,

CASE

WHEN ec2.instance_id IS NOT NULL THEN 'AWS'

WHEN gcp.name IS NOT NULL THEN 'GCP'

WHEN az.name IS NOT NULL THEN 'Azure'

ELSE 'Other'

END AS cloud_provider,

COALESCE(ec2.instance_id, gcp.name, az.name) AS resource_id,

proc.name AS process_name,

proc.path,

proc.cmdline,

proc.start_time,

proc.uid,

proc.gid

FROM

processes proc

-- Using processes table (historical)

JOIN

system_info sys ON proc.upt_asset_id = sys.upt_asset_id

-- Joining with system information

LEFT JOIN

aws_ec2_instance_current ec2 ON proc.upt_asset_id = ec2.upt_asset_id

-- Optional join with AWS resources

LEFT JOIN

gcp_compute_instance_current gcp ON proc.upt_asset_id = gcp.upt_asset_id

-- Optional join with GCP resources

LEFT JOIN

azure_compute_vm_current az ON proc.upt_asset_id = az.upt_asset_id

-- Optional join with Azure resources

WHERE

(

proc.name IN ('nc', 'netcat', 'ncat', 'socat', 'nmap')

-- Looking for network utility processes

OR proc.cmdline LIKE '%reverse-shell%'

-- Looking for potential reverse shells

OR proc.cmdline LIKE '%base64 -d%'

-- Looking for base64 decoded execution

OR proc.path LIKE '/tmp/%'

-- Processes running from temporary directories

)

AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Looking back 1 day for suspicious processes

-- REQUIRED upt_day filter for the historical table

AND sys.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- REQUIRED upt_day filter for system_info (historical table)

-- Each historical table must have its own upt_day filter

ORDER BY

proc.start_time DESC

-- Most recent processes first

Identifying Suspicious Outbound Network Connections

This query detects potentially malicious outbound network connections from cloud instances by

monitoring process-level network activity and identifying connections to known suspicious ports

or IP ranges.

Detection Coverage:

- Suspicious port usage (4444, 31337, etc.)

- Known malicious IP ranges

- Unusual process-to-network patterns

- Cross-region communications

Key Components:

1. Network Connection Analysis

- Port monitoring

- IP range filtering

- Connection state tracking

2. Process Context

Unset

- Process identification

- Command line analysis

- Path verification

3. Cloud Asset Correlation

- Instance metadata

- Account context

- Regional information

Performance Optimization:

- Uses _current tables for cloud assets

- Proper upt_day filtering

- Efficient join strategies

SELECT

ec2.instance_id, ec2.private_ip_address, ec2.account_id, ec2.region,

proc.name AS process_name, proc.path AS process_path, proc.cmdline,

conn.local_address,conn.local_port,conn.remote_address, conn.remote_port,

conn.state

FROM

aws_ec2_instance_current ec2

-- Using _current table for EC2 instances

JOIN

process_open_sockets conn ON ec2.upt_asset_id = conn.upt_asset_id

-- Joining with open network connections

JOIN

processes proc ON conn.pid = proc.pid AND conn.upt_asset_id =

proc.upt_asset_id

-- Joining with process information

WHERE

(

conn.remote_port IN (4444, 31337, 8081, 8082, 8083, 8084, 8085)

-- Potentially suspicious ports

OR conn.remote_address LIKE '185.%'

-- Specific suspicious IP ranges

)

-- Looking for connections to suspicious destinations

AND conn.state = 'ESTABLISHED'

-- Looking for established connections

AND NOT (

-- Excluding known good addresses/services

conn.remote_address LIKE '23.21.227.%' OR -- AWS IPs

conn.remote_address LIKE '13.32.%' OR -- CloudFront

conn.remote_address LIKE '13.33.%' OR -- CloudFront

conn.remote_address LIKE '52.84.%' OR -- CloudFront

conn.remote_address LIKE '52.85.%' -- CloudFront

)

AND proc.name NOT IN ('chrome', 'firefox', 'aws', 'gcloud', 'az')

-- Excluding common user applications

AND conn.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Current day filter for connections - REQUIRED

AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Current day filter for processes - REQUIRED

ORDER BY

ec2.instance_id,

proc.start_time DESC

Detecting Possible Command and Control (C2) Communications

This query identifies potential command and control (C2) traffic by analyzing network

connections from cloud instances for patterns commonly associated with malicious command

and control channels.

Detection Coverage:

- Suspicious network connections

- High-risk ports and protocols

- Known C2 patterns and behaviors

- Unusual process-to-network relationships

- Cross-cloud C2 detection

Key Components:

1. Network Analysis

- Connection state monitoring

- Port categorization

- Protocol verification

2. Process Context

- Binary verification

- Command-line analysis

Unset

- Process lineage

3. Risk Scoring

- Port risk assessment

- Destination reputation

- Connection patterns

Performance Optimization:

- Efficient network connection joins

- Process context correlation

- Proper day filtering

- Targeted exclusion lists

SELECT

'AWS' AS cloud_provider,

ec2.instance_id AS resource_id,

ec2.private_ip_address,

ec2.public_ip_address,

ec2.region,

ec2.account_id,

proc.name AS process_name,

proc.path AS process_path,

proc.cmdline,

sock.remote_address,

sock.remote_port,

sock.state,

CASE

WHEN sock.remote_port IN (4444, 8080, 8083, 1080, 3389, 8090) THEN

'High-Risk Port'

ELSE 'Standard Port'

END AS port_risk

FROM

aws_ec2_instance_current ec2

-- Start with current EC2 instances

JOIN

processes proc ON ec2.upt_asset_id = proc.upt_asset_id

-- Link instances to their running processes

JOIN

process_open_sockets sock ON proc.pid = sock.pid AND proc.upt_asset_id =

sock.upt_asset_id

-- Link processes to their network connections using both pid and asset ID

for accuracy

WHERE

sock.remote_port NOT IN (80, 443, 22, 53)

-- Exclude common legitimate service ports (HTTP, HTTPS, SSH, DNS)

AND sock.state = 'ESTABLISHED'

-- Only look at active connections

AND sock.remote_address NOT LIKE '10.%'

AND sock.remote_address NOT LIKE '172.16.%'

AND sock.remote_address NOT LIKE '192.168.%'

AND sock.remote_address NOT LIKE '127.%'

AND sock.remote_address NOT LIKE '169.254.%'

-- Exclude connections to private IP ranges and localhost

AND proc.name NOT IN ('chrome', 'firefox', 'aws', 'aws-cli', 'aws.exe')

-- Exclude known legitimate processes

AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Get current day's process data

AND sock.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Get current day's socket data

UNION ALL

SELECT

'GCP' AS cloud_provider,

gcp.name AS resource_id,

JSON_EXTRACT_SCALAR(gcp.network_interfaces, '$[0].networkIP') AS

private_ip_address,

JSON_EXTRACT_SCALAR(gcp.network_interfaces, '$[0].accessConfigs[0].natIP')

AS public_ip_address,

gcp.region,

gcp.project_id AS account_id,

proc.name AS process_name,

proc.path AS process_path,

proc.cmdline,

sock.remote_address,

sock.remote_port,

sock.state,

CASE

WHEN sock.remote_port IN (4444, 8080, 8083, 1080, 3389, 8090) THEN

'High-Risk Port'

ELSE 'Standard Port'

END AS port_risk

FROM

gcp_compute_instance_current gcp

-- Start with current GCP instances

JOIN

processes proc ON gcp.upt_asset_id = proc.upt_asset_id

-- Link instances to their running processes

JOIN

process_open_sockets sock ON proc.pid = sock.pid AND proc.upt_asset_id =

sock.upt_asset_id

-- Link processes to their network connections

WHERE

sock.remote_port NOT IN (80, 443, 22, 53)

-- Exclude common legitimate service ports

AND sock.state = 'ESTABLISHED'

-- Only look at active connections

AND sock.remote_address NOT LIKE '10.%'

AND sock.remote_address NOT LIKE '172.16.%'

AND sock.remote_address NOT LIKE '192.168.%'

AND sock.remote_address NOT LIKE '127.%'

AND sock.remote_address NOT LIKE '169.254.%'

-- Exclude connections to private IP ranges and localhost

AND proc.name NOT IN ('chrome', 'firefox', 'gcloud', 'gcloud.cmd')

-- Exclude known legitimate GCP-specific processes

AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Get current day's process data

AND sock.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Get current day's socket data

UNION ALL

SELECT

'Azure' AS cloud_provider,

az.name AS resource_id,

JSON_EXTRACT_SCALAR(az.network_profile,

'$.networkInterfaces[0].privateIPAddress') AS private_ip_address,

JSON_EXTRACT_SCALAR(az.network_profile,

'$.networkInterfaces[0].publicIPAddress') AS public_ip_address,

az.region AS region,

az.subscription_id AS account_id,

proc.name AS process_name,

proc.path AS process_path,

proc.cmdline,

sock.remote_address,

sock.remote_port,

sock.state,

CASE

WHEN sock.remote_port IN (4444, 8080, 8083, 1080, 3389, 8090) THEN

'High-Risk Port'

ELSE 'Standard Port'

END AS port_risk

FROM

azure_compute_vm_current az

-- Start with current Azure VMs

JOIN

processes proc ON az.upt_asset_id = proc.upt_asset_id

-- Link VMs to their running processes

JOIN

process_open_sockets sock ON proc.pid = sock.pid AND proc.upt_asset_id =

sock.upt_asset_id

-- Link processes to their network connections

WHERE

sock.remote_port NOT IN (80, 443, 22, 53)

-- Exclude common legitimate service ports

AND sock.state = 'ESTABLISHED'

-- Only look at active connections

AND sock.remote_address NOT LIKE '10.%'

AND sock.remote_address NOT LIKE '172.16.%'

AND sock.remote_address NOT LIKE '192.168.%'

AND sock.remote_address NOT LIKE '127.%'

AND sock.remote_address NOT LIKE '169.254.%'

-- Exclude connections to private IP ranges and localhost

AND proc.name NOT IN ('chrome', 'firefox', 'az', 'azure-cli')

-- Exclude known legitimate Azure-specific processes

AND proc.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Get current day's process data

AND sock.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- Get current day's socket data

ORDER BY

cloud_provider,

resource_id

Suspicious File Execution and Privilege Escalation Detection

This query detects potential privilege escalation attempts and suspicious process execution

patterns across cloud environments.

Detection Coverage:

- Suspicious file locations (/tmp, /dev/shm)

- Privilege escalation attempts

- Suspicious command patterns

- Fileless malware execution

- Shell code execution

Key Components:

1. Process Event Monitoring

2. Cloud Asset Correlation

3. Command Pattern Analysis

4. Risk Level Assessment

Performance Considerations:

- Uses _current tables for cloud assets

- Proper upt_day filtering on historical tables

- Efficient LEFT JOINs for cloud correlation

SELECT

-- Resource and event identification

CASE

WHEN ec2.instance_id IS NOT NULL THEN CONCAT('AWS EC2: ',

ec2.instance_id)

WHEN gcp.name IS NOT NULL THEN CONCAT('GCP: ', gcp.name)

WHEN az.name IS NOT NULL THEN CONCAT('Azure: ', az.name)

ELSE CONCAT('Host: ', pe.upt_hostname)

END AS resource_id, pe.upt_hostname AS hostname,

-- Timestamp information

pe.time AS event_time, date_format(from_unixtime(pe.time), '%Y-%m-%d

%H:%i:%s') AS formatted_time,

-- Process details

pe.path AS process_path, pe.cmdline AS command_line,

pe.pid, pe.parent AS parent_pid,

-- User context

pe.uid AS user_id, pe.euid AS effective_user_id,

CASE WHEN pe.uid != pe.euid THEN 'Yes' ELSE 'No' END AS

privilege_escalation,

-- Process context

pe.cwd AS current_directory, pe.ancestor_list,

CASE

WHEN pe.cmdline LIKE '%chmod%777%' OR

pe.cmdline LIKE '%chmod%+x%' OR

pe.cmdline LIKE '%chown%' OR

pe.cmdline LIKE '%sudo%' OR

pe.cmdline LIKE '%su %root%' OR

pe.cmdline LIKE '%/tmp/%' OR

pe.cmdline LIKE '%/dev/shm/%' OR

pe.cmdline LIKE '%nc %' OR

pe.cmdline LIKE '%netcat%' OR

pe.cmdline LIKE '%ncat%' OR

pe.cmdline LIKE '%curl%|%sh%' OR

pe.cmdline LIKE '%wget%|%sh%' OR

pe.cmdline LIKE '%base64%-d%'

THEN 'High'

WHEN pe.cmdline LIKE '%/var/tmp/%' OR

pe.cmdline LIKE '%eval%' OR

pe.cmdline LIKE '%`%`%'

THEN 'Medium'

ELSE 'Low'

END AS risk_level

FROM

process_events pe

-- Base table: process_events contains all process creation events

(historical table)

-- Using this instead of 'processes' to catch short-lived processes that

might be missed

LEFT JOIN

aws_ec2_instance_current ec2 ON pe.upt_asset_id = ec2.upt_asset_id

-- LEFT JOIN with _current table preserves all process events even if

instance is terminated

-- No upt_day filter needed for _current table

LEFT JOIN

gcp_compute_instance_current gcp ON pe.upt_asset_id = gcp.upt_asset_id

-- Similar LEFT JOIN pattern for GCP instances

-- Using _current tables ensures we get the latest instance metadata

LEFT JOIN

azure_compute_vm_current az ON pe.upt_asset_id = az.upt_asset_id

-- Completing the cloud provider coverage with Azure VMs

-- LEFT JOINs allow detection of events even on non-cloud hosts

WHERE

(

-- Suspicious locations: Common paths used for malware and temporary

files

pe.path LIKE '/tmp/%' OR

pe.path LIKE '/dev/shm/%' OR

pe.path LIKE '/var/tmp/%' OR

-- Privilege escalation: Detecting effective UID changes

-- Different UIDs indicate potential privilege escalation

pe.uid != pe.euid OR

-- Suspicious command patterns: Common system modification commands

pe.cmdline LIKE '%chmod%777%' OR -- Full permission changes

pe.cmdline LIKE '%chmod%+x%' OR -- Making files executable

pe.cmdline LIKE '%chown%' OR -- Ownership changes

pe.cmdline LIKE '%sudo%' OR -- Privilege escalation

pe.cmdline LIKE '%su %root%' OR -- Direct root access attempts

-- Network tools: Often used in post-exploitation

pe.cmdline LIKE '%nc %' OR -- Netcat variations

pe.cmdline LIKE '%netcat%' OR -- Full netcat name

pe.cmdline LIKE '%ncat%' OR -- Nmap's netcat

-- Fileless execution: Common fileless malware patterns

pe.cmdline LIKE '%curl%|%sh%' OR -- Direct pipe to shell

pe.cmdline LIKE '%wget%|%sh%' OR -- Another pipe to shell variant

pe.cmdline LIKE '%base64%-d%' OR -- Encoded payload execution

-- Shell code: Dynamic code execution patterns

pe.cmdline LIKE '%eval%' OR -- Direct command evaluation

pe.cmdline LIKE '%`%`%' -- Backtick command substitution

)

AND pe.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT)

-- CRITICAL: upt_day filter on process_events (historical table)

-- Using current day only to focus on recent suspicious activity

-- Without this filter, query would scan ALL historical data

ORDER BY

pe.time DESC,

risk_level

Multi-Cloud Lateral Movement Detection

This query provides comprehensive detection of lateral movement attempts across multi-cloud

environments by monitoring process-to-process communications and network connections

between assets.

Detection Coverage:

- Cross-cloud movement (AWS ↔ GCP ↔ Azure)

- Cross-account privilege escalation

- Cross-region traversal

- Process-based detection (SSH, RDP, PowerShell)

- Network-based detection (ports, protocols)

Key Components:

1. Asset Registry

- Normalized view of cloud assets

- Unified IP and metadata handling

- Cross-cloud resource mapping

2. Process Monitoring

- Track suspicious processes

- Monitor privileged operations

- Analyze command patterns

3. Network Analysis

- Monitor established connections

- Track cross-boundary movement

- Identify unusual patterns

4. Risk Assessment

- Movement pattern analysis

- Privilege level evaluation

- Context-aware scoring

Performance Optimization:

- Uses _current tables for cloud assets

- Proper upt_day filters on historical tables

- Efficient join strategies

- Indexed column usage

- Optimized filtering order

WITH asset_registry AS (

-- Build a unified view of all cloud assets across providers

-- This CTE normalizes instance metadata for correlation across AWS, GCP,

and Azure

-- AWS Assets Section

SELECT

'AWS' AS cloud_provider, instance_id AS resource_id,

-- Provider and ID

private_ip_address AS internal_ip, public_ip_address AS external_ip,

-- Network addresses

region, account_id, upt_asset_id,

-- Asset context

instance_type AS resource_type, state_name AS state,

-- Instance details

JSON_EXTRACT_SCALAR(tags, '$.Name') AS name, 'EC2' AS service_type

-- Metadata

FROM

aws_ec2_instance_current

WHERE

state_name = 'running'

UNION ALL

-- GCP Assets Section

SELECT

'GCP' AS cloud_provider, name AS resource_id,

-- Provider and ID

JSON_EXTRACT_SCALAR(network_interfaces, '$[0].networkIP') AS

internal_ip,

JSON_EXTRACT_SCALAR(network_interfaces, '$[0].accessConfigs[0].natIP')

AS external_ip, -- Network

region, project_id AS account_id, upt_asset_id,

-- Asset context

machine_type AS resource_type, status AS state,

-- Instance details

JSON_EXTRACT_SCALAR(labels, '$.name') AS name, 'Compute' AS

service_type -- Metadata

FROM

gcp_compute_instance_current

WHERE

status = 'RUNNING'

UNION ALL

-- Azure Assets Section

SELECT

'Azure' AS cloud_provider, name AS resource_id,

-- Provider and ID

JSON_EXTRACT_SCALAR(network_profile,

'$.networkInterfaces[0].privateIPAddress') AS internal_ip,

JSON_EXTRACT_SCALAR(network_profile,

'$.networkInterfaces[0].publicIPAddress') AS external_ip, -- Network

region, subscription_id AS account_id, upt_asset_id, -- Asset

context

JSON_EXTRACT_SCALAR(hardware_profile, '$.vmSize') AS resource_type,

provisioning_state AS state, name AS name, 'VM' AS service_type

-- Instance details

FROM

azure_compute_vm_current

WHERE

provisioning_state = 'Succeeded'

)

SELECT

-- Source Asset Information

src.cloud_provider AS source_cloud, src.resource_id AS source_id, --

Cloud platform and resource

src.internal_ip AS source_ip, src.region AS source_region, --

Network and location

src.account_id AS source_account, src.service_type AS source_service, --

Account and service context

-- Destination Asset Information

dst.cloud_provider AS destination_cloud, dst.resource_id AS destination_id,

-- Cloud and resource

dst.internal_ip AS destination_ip, dst.region AS destination_region,

-- Network and location

dst.account_id AS destination_account, dst.service_type AS

destination_service, -- Account and service

-- Movement Classification

-- Categorize the type of lateral movement attempt

-- Higher risk movements are listed first in the CASE statement

CASE

WHEN src.cloud_provider != dst.cloud_provider THEN 'Cross-Cloud' --

Highest risk: moving between cloud providers

WHEN src.account_id != dst.account_id THEN 'Cross-Account' --

High risk: moving between accounts

WHEN src.region != dst.region THEN 'Cross-Region' --

Medium risk: geographic movement

ELSE 'Same Environment' --

Lower risk: local movement

END AS movement_type,

-- Process Information

p.name AS process_name, p.path AS process_path, -- Process

identification

p.cmdline AS command_line, p.uid AS user_id, -- Execution context

pos.local_port, pos.remote_port, pos.state, -- Connection

details

-- Risk Level Assessment

-- Evaluate process names for known lateral movement tools and scripting

languages

CASE

WHEN p.name IN ('ssh', 'scp', 'rdp', 'mstsc.exe', 'psexec.exe',

'winrm', 'winrs', 'nc', 'netcat')

THEN 'High' -- Direct remote access and known lateral movement

tools

WHEN p.name IN ('powershell.exe', 'powershell', 'python', 'python3',

'perl', 'ruby', 'bash', 'sh', 'cmd.exe')

THEN 'Medium' -- Scripting languages that could be used

maliciously

ELSE 'Low' -- Other processes

END AS risk_level,

CASE

WHEN src.account_id != dst.account_id OR src.cloud_provider !=

dst.cloud_provider THEN 'Alert'

ELSE 'Info'

END AS alert_level,

p.start_time

FROM

asset_registry src -- Start with source assets

from our normalized view

JOIN

processes p -- Get process information

ON src.upt_asset_id = p.upt_asset_id -- Link process to source asset

JOIN

process_open_sockets pos -- Get network connection

information

ON p.pid = pos.pid -- Link socket to specific

process

Unset

AND p.upt_asset_id = pos.upt_asset_id -- Ensure socket belongs to

same asset

JOIN

asset_registry dst -- Join to destination asset

ON pos.remote_address = dst.internal_ip -- Match on private IP

OR pos.remote_address = dst.external_ip -- Also check public IP for

external connections

WHERE

src.upt_asset_id != dst.upt_asset_id -- Exclude self-connections

AND pos.state = 'ESTABLISHED' -- Only look at active

connections

-- Time-based filters for historical tables

AND p.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT) -- Last 24h of process data

AND pos.upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY),

'%Y%m%d') AS INT) -- Last 24h of socket data

ORDER BY

movement_type,

risk_level,

p.start_time DESC

Performance Optimization Tips

When working with Uptycs schema, follow these best practices to optimize query performance:

1. Historical Table Management

ALWAYS filter historical tables by upt_day

- This is not optional - queries will fail or retrieve excessive data without this filter

- The upt_day filter is a required field for all historical tables

- Missing this filter is the most common cause of query performance problems

Example:

WHERE

upt_day >= CAST(date_format((CURRENT_DATE - INTERVAL '0' DAY), '%Y%m%d') AS

INT)

2. Table Selection Strategy

Use _current tables when possible

- Pre-filtered and provide better performance

- Only use historical tables when historical analysis is required

- Not all tables follow the "*_current" naming pattern

Example:

FROM

aws_ec2_instance_current

-- Better than using aws_ec2_instance for current state

3. Join Optimization

Filter early in joins

- Apply filters to tables before joining

- Use appropriate join types (LEFT vs INNER)

- Consider join order for large datasets

Example:

JOIN

processes p ON asset.upt_asset_id = p.upt_asset_id

-- Join on indexed columns

WHERE

p.name LIKE '%suspicious%'

-- Apply process filters after join

4. Data Filtering

Use specific filters on high-cardinality columns

- Filter on instance_id, account_id, or region before joining

- Use equality operators when possible

- Leverage indexed columns in WHERE clauses

Example:

WHERE

account_id = '123456789012'

AND region = 'us-east-1'

5. JSON Handling

Optimize JSON operations

- Extract only needed JSON fields

- Apply filters to JSON fields in WHERE clauses

- Use JSON_EXTRACT_SCALAR for better performance

Example:

SELECT

JSON_EXTRACT_SCALAR(tags, '$.Name') AS instance_name

-- Better than extracting entire tags object

Verified Table Information

Based on the schema files, here are the correct table names and key columns for cloud

resources:

AWS EC2 Instances

- Table name: aws_ec2_instance_current

- Key columns:

● instance_id: The ID of the instance (VARCHAR)

● instance_type: The instance type (VARCHAR)

● private_ip_address: The private IPv4 address assigned to the instance

(VARCHAR)

● public_ip_address: The public IPv4 address (VARCHAR)

● state_name: The current state of the instance (VARCHAR)

Related Articles
OpenVPN query
The following query should identify all assets with OpenVPN installed and last time it was used SELECT p.upt_hostname AS asset_name, MAX(p.upt_time) AS openvpn_last_accessed FROM socket_events se JOIN processes p on se.pid = p.pid WHERE se.pid > 0 ...
Query to find the number of Osquery restarts in a day for all hosts
Following query, run on Global database, would provide details on number of osquery restarts in a day (GMT) Query SELECT upt_hostname, Count(0) FROM processes WHERE name LIKE 'osqueryd%' AND upt_added AND upt_day = Cast(Date_format(current_date - ...
Windows Disk Usage query
The article includes query to fetch the Windows Disk usage: Query SELCET DISTINCT upt_hostname, upt_time, device_id, size, free_space, size - free_space ...
Query to find Processes Running
Processes running between certain times Query SELECT * FROM windowed_processes p WHERE p.upt_add_time >= <TIMESTAMP> AND p.upt_add_time <= <TIMESTAMP>; Example All processes running between 2018-07-18 2:00 to 2018-08-18 3:00 SELECT * FROM ...
High CPU Utilization Alert Query
Query WITH cpu_incremental AS ( SELECT upt_asset_id, upt_hostname, upt_time, cpu_idle - LAG(cpu_idle) OVER (PARTITION BY upt_asset_id ORDER BY upt_time) AS cpu_idle, cpu_total - LAG(cpu_total) OVER (PARTITION BY upt_asset_id ORDER BY upt_time) AS ...

Global Query Tutorial

Global Query Tutorial

Related Articles

OpenVPN query

Query to find the number of Osquery restarts in a day for all hosts

Windows Disk Usage query

Query to find Processes Running

High CPU Utilization Alert Query