This document outlines the plan for adding SQL querying support to SeaweedFS, focusing on reading and analyzing data from Message Queue (MQ) topics.
1. Goal
To provide a SQL querying interface for SeaweedFS, enabling analytics on existing MQ topics. This enables:
2. Key Features
SHOW DATABASES - List all MQ namespacesSHOW TABLES - List all topics in a namespaceDESCRIBE table_name - Show topic schema detailsSELECT support with WHERE, LIMIT, OFFSETCOUNT(), SUM(), AVG(), MIN(), MAX()weed sql with interactive shell mode3. Data Source Integration
weed/mq/logstore/read_parquet_to_log.goweed/mq/schema/schema.go for SQL metadata operations4. API & CLI Integration
weed sql command with interactive shell mode (similar to weed shell)Scenario 1: Schema Discovery and Metadata
-- List all namespaces (databases)
SHOW DATABASES;
-- List topics in a namespace
USE my_namespace;
SHOW TABLES;
-- View topic structure and discovered schema
DESCRIBE user_events;
Scenario 2: Data Querying
-- Basic filtering and projection
SELECT user_id, event_type, timestamp
FROM user_events
WHERE timestamp > 1640995200000
LIMIT 100;
-- Aggregation queries
SELECT COUNT(*) as event_count
FROM user_events
WHERE timestamp >= 1640995200000;
-- More aggregation examples
SELECT MAX(timestamp), MIN(timestamp)
FROM user_events;
Scenario 3: Analytics & Monitoring
-- Basic analytics
SELECT COUNT(*) as total_events
FROM user_events
WHERE timestamp >= 1640995200000;
-- Simple monitoring
SELECT AVG(response_time) as avg_response
FROM api_logs
WHERE timestamp >= 1640995200000;
## Architecture Overview
SQL Query Flow:
1. Parse SQL 2. Plan & Optimize 3. Execute Query
┌─────────────┐ ┌──────────────┐ ┌─────────────────┐ ┌──────────────┐ │ Client │ │ SQL Parser │ │ Query Planner │ │ Execution │ │ (CLI) │──→ │ PostgreSQL │──→ │ & Optimizer │──→ │ Engine │ │ │ │ (Custom) │ │ │ │ │ └─────────────┘ └──────────────┘ └─────────────────┘ └──────────────┘
│ │
│ Schema Lookup │ Data Access
▼ ▼
┌─────────────────────────────────────────────────────────────┐
│ Schema Catalog │
│ • Namespace → Database mapping │
│ • Topic → Table mapping │
│ • Schema version management │
└─────────────────────────────────────────────────────────────┘
▲
│ Metadata
│
┌─────────────────────────────────────────────────────────────────────────────┐ │ MQ Storage Layer │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ▲ │ │ │ Topic A │ │ Topic B │ │ Topic C │ │ ... │ │ │ │ │ (Parquet) │ │ (Parquet) │ │ (Parquet) │ │ (Parquet) │ │ │ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ │ └──────────────────────────────────────────────────────────────────────────│──┘
│
Data Access
```