Database Automation with LLMs
KnowledgeFlowDB provides a powerful automation system that lets you trigger LLM operations automatically when graph events occur. This enables intelligent, self-maintaining knowledge graphs that can:
- 🤖 Auto-summarize documents as they're added
- 🔍 Auto-embed content for semantic search
- 🏷️ Auto-classify and tag nodes
- 🔗 Auto-extract entities and relationships
- ✨ Run custom operations with your own prompts
How It Works
The automation system follows a simple trigger → action pattern:
- Trigger: An event occurs in your graph (node created, updated, etc.)
- Filter: Check if the event matches your rule's criteria
- Execute: Run an LLM operation on the matching data
- Output: Store the result back in the graph
graph LR
A[Graph Event] --> B{Matches Rule?}
B -->|Yes| C[Extract Input]
C --> D[Call LLM]
D --> E[Store Output]
B -->|No| F[Ignore]
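The same flow can be sketched in Python. This is a minimal illustration, not KnowledgeFlowDB's implementation: the event and rule dictionary shapes, `run_rule`, and `fake_llm` are all hypothetical, and the LLM call is stubbed so the example runs offline.

```python
from typing import Callable, Optional


def run_rule(event: dict, rule: dict, call_llm: Callable[[str], str]) -> Optional[dict]:
    """Apply one rule to one graph event; return updated properties or None."""
    # 1. Trigger: the event type must be the one the rule listens for.
    if event["type"] != rule["trigger"]:
        return None

    node = event["node"]
    # 2. Filter: the node needs a matching label and the input property.
    if not set(rule["labels"]) & set(node["labels"]):
        return None
    if rule["input_property"] not in node["properties"]:
        return None

    # 3. Execute: fill the prompt template and call the LLM.
    text = node["properties"][rule["input_property"]]
    placeholder = "{" + rule["input_property"] + "}"
    result = call_llm(rule["prompt_template"].replace(placeholder, text))

    # 4. Output: return the node's properties with the result added.
    return {**node["properties"], rule["output_property"]: result}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real provider call (assumption: no network access)."""
    return f"[summary of {len(prompt)} chars]"


rule = {
    "trigger": "node_created",
    "labels": ["Document"],
    "input_property": "content",
    "prompt_template": "Summarize the following text in 2-3 sentences:\n\n{content}",
    "output_property": "summary",
}
event = {
    "type": "node_created",
    "node": {
        "labels": ["Document"],
        "properties": {"content": "Graphs store nodes and edges."},
    },
}
updated = run_rule(event, rule, fake_llm)
print(updated)
```

Passing the LLM call in as a function keeps the trigger and filter logic testable without a provider account.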
Key Concepts
Triggers
Events that can activate automation rules:
- node_created: A new node is added
- node_updated: Node properties change
- node_deleted: Node is removed
- edge_created: New relationship created
- edge_deleted: Relationship removed
Filters
Narrow down which events trigger your rule:
- Labels: Only match nodes/edges with specific labels (e.g., ["Document", "Article"])
- Properties: Only match when specific properties exist (e.g., has_property: "content")
- Custom: Advanced filtering with JSON criteria
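As a rough sketch of how label and property filters could be evaluated, the function below treats the label list as "any of" and an empty list as "no restriction". `matches_filter` and the node dictionary shape are hypothetical, and custom JSON criteria are left out because their semantics are not described here.

```python
def matches_filter(node: dict, rule_filter: dict) -> bool:
    """Return True when a node satisfies a label/property filter."""
    wanted = rule_filter.get("labels", [])
    # An empty label list places no restriction, so every node passes this check.
    if wanted and not set(wanted) & set(node.get("labels", [])):
        return False

    required = rule_filter.get("has_property")
    # When has_property is set, the node must carry that property.
    if required is not None and required not in node.get("properties", {}):
        return False
    return True


article = {"labels": ["Article"], "properties": {"content": "..."}}
person = {"labels": ["Person"], "properties": {"name": "Ada"}}
doc_filter = {"labels": ["Document", "Article"], "has_property": "content"}

print(matches_filter(article, doc_filter))  # True
print(matches_filter(person, doc_filter))   # False
```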
Operations
LLM tasks to perform:
- summarize: Generate concise summaries
- generate_embedding: Create vector embeddings
- extract_entities: Pull out named entities
- classify: Categorize content
- custom: Run your own prompt template
Output Strategies
How to save LLM results:
- update_property: Add/update a property on the same node
- create_node: Create a new node with the result
- create_edge: Create a relationship to another node
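To make the three strategies concrete, here is a small in-memory sketch. The graph layout and the output keys (`property`, `new_id`, `label`, `target_id`, `edge_type`) are assumptions made for illustration. In particular, storing the LLM result on the new edge is a guess at what `create_edge` might do, not documented behavior.

```python
def apply_output(graph: dict, source_id: str, result: str, output: dict) -> None:
    """Write an LLM result into an in-memory graph using the given strategy."""
    strategy = output["strategy"]
    if strategy == "update_property":
        # Add or overwrite a property on the node that triggered the rule.
        graph["nodes"][source_id]["properties"][output["property"]] = result
    elif strategy == "create_node":
        # Store the result on a new node (hypothetical caller-supplied id).
        graph["nodes"][output["new_id"]] = {
            "labels": [output["label"]],
            "properties": {output["property"]: result},
        }
    elif strategy == "create_edge":
        # Link the source node to an existing target node.
        graph["edges"].append({
            "from": source_id,
            "to": output["target_id"],
            "type": output["edge_type"],
            "properties": {output["property"]: result},
        })
    else:
        raise ValueError(f"unknown output strategy: {strategy}")


graph = {
    "nodes": {
        "doc1": {"labels": ["Document"], "properties": {"content": "..."}},
        "topic1": {"labels": ["Topic"], "properties": {"name": "Databases"}},
    },
    "edges": [],
}
apply_output(graph, "doc1", "A short summary.",
             {"strategy": "update_property", "property": "summary"})
apply_output(graph, "doc1", "Ada Lovelace",
             {"strategy": "create_node", "new_id": "ent1", "label": "Entity", "property": "name"})
apply_output(graph, "doc1", "0.92",
             {"strategy": "create_edge", "target_id": "topic1", "edge_type": "ABOUT",
              "property": "confidence"})
```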
Supported LLM Providers
| Provider | Models | Use Case |
|---|---|---|
| Google | gemini-2.5-flash-preview-09-2025 (recommended) | Fast, accurate, cost-effective |
| OpenAI | TBD | High-quality text generation |
| Anthropic | TBD | Agentic coding |
For production use, we recommend Gemini 2.5 Flash (gemini-2.5-flash-preview-09-2025):
- ⚡ Fastest response times (under 500ms avg)
- 💰 Most cost-effective ($0.075 per 1M tokens)
- ✅ Validated in production with 3-node ScyllaDB cluster
Interactive Playground
Try creating and managing automation rules right here in the docs! You can:
- ✅ Use the example database to see how automation works
- ✅ Connect to your own database (local or production)
- ✅ Create rules from templates or build custom rules
- ✅ Monitor executions in real-time
Connection Options
Option 1: Example Database (Recommended for Learning)
The playground connects to a production 3-node ScyllaDB cluster by default. This lets you:
- See real automation rules in action
- Experiment without setting up infrastructure
- Learn by example with pre-configured rules
Just click "Create Rule" and start experimenting!
Option 2: Your Own Database
Click "Connect Your DB" to use your own KnowledgeFlowDB instance:
Local 3-Node Cluster:
API Endpoint: http://localhost:8080/api/v1
API Key: YOUR_API_KEY
Production Cluster:
API Endpoint: http://35.223.203.166/api/v1
API Key: YOUR_API_KEY
Your API key is stored locally in your browser only. It's never sent to our documentation server.
Quick Start: Your First Rule
Let's create a rule that auto-summarizes documents:
1. Set Trigger
- Trigger Type: node_created
- Match Labels: Document
- Must Have Property: content
2. Configure LLM
- Operation: summarize
- Provider: google
- Model: gemini-2.5-flash-preview-09-2025
- Prompt Template: Summarize the following text in 2-3 sentences:\n\n{content}
- Input Property: content
3. Set Output
- Strategy: update_property
- Output Property: summary
4. Activate
- Is Active: ✅ Yes
That's it! Now whenever you create a node with label Document and a content property, it will automatically get a summary property added.
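For readers who prefer to see the rule as data, the four steps might translate into a structure like the one below. The values come from the steps above, but the nesting and key names are assumptions that mirror the form fields; check the Automation API Reference for the actual schema. `render_prompt` is a hypothetical helper showing how the {content} placeholder would be filled from the input property.

```python
import json

# Hypothetical rule layout; values are taken from the quick-start steps.
rule = {
    "trigger": {
        "type": "node_created",
        "labels": ["Document"],
        "has_property": "content",
    },
    "llm": {
        "operation": "summarize",
        "provider": "google",
        "model": "gemini-2.5-flash-preview-09-2025",
        "prompt_template": "Summarize the following text in 2-3 sentences:\n\n{content}",
        "input_property": "content",
    },
    "output": {"strategy": "update_property", "property": "summary"},
    "is_active": True,
}


def render_prompt(rule: dict, properties: dict) -> str:
    """Substitute the node's input property into the prompt template."""
    llm = rule["llm"]
    placeholder = "{" + llm["input_property"] + "}"
    return llm["prompt_template"].replace(placeholder, properties[llm["input_property"]])


prompt = render_prompt(rule, {"content": "KnowledgeFlowDB stores a knowledge graph."})
print(prompt)
print(json.dumps(rule, indent=2))
```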
Rule Templates
Use these pre-built templates as starting points:
Auto-summarize Documents
Automatically generate summaries for new documents using Gemini.
Best for: Blog posts, articles, documentation
Trigger: Node created with label Document or Article and property content
Output: Adds summary property with 2-3 sentence summary
Auto-embed Code Files
Generate embeddings for newly created code files for semantic search.
Best for: Code repositories, documentation
Trigger: Node created with label File and property content
Output: Adds embedding property with 1024-dim vector
Extract Entities
Extract named entities (people, places, organizations) from content.
Best for: News articles, research papers
Trigger: Node created with label Document and property content
Output: Creates new Entity nodes linked to the document
Classify Content
Automatically categorize documents into predefined categories.
Best for: Content management, organization
Trigger: Node created with label Document and property content
Output: Adds category property (Technical, Business, Research, etc.)
Best Practices
1. Start with Templates
Use the built-in templates and customize them for your needs. This ensures you start with validated configurations.
2. Test with Inactive Rules First
Create rules with is_active: false, test manually, then activate once validated.
3. Monitor Token Usage
Check the executions table to track:
- Token consumption
- Costs per execution
- Success/failure rates
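As a worked example of these three metrics, the sketch below aggregates a handful of execution records. The record fields (`status`, `tokens`) are hypothetical, and the price constant is the $0.075 per 1M tokens figure quoted for Gemini 2.5 Flash in the provider section; substitute your provider's current rate.

```python
PRICE_PER_MILLION_TOKENS_USD = 0.075  # figure quoted for Gemini 2.5 Flash


def summarize_executions(executions: list) -> dict:
    """Aggregate token usage, estimated cost, and success rate."""
    total = len(executions)
    succeeded = sum(1 for e in executions if e["status"] == "success")
    tokens = sum(e["tokens"] for e in executions)
    return {
        "executions": total,
        "success_rate": succeeded / total if total else 0.0,
        "total_tokens": tokens,
        "estimated_cost_usd": tokens * PRICE_PER_MILLION_TOKENS_USD / 1_000_000,
    }


# Hypothetical execution records for illustration only.
sample = [
    {"status": "success", "tokens": 1200},
    {"status": "success", "tokens": 800},
    {"status": "failed", "tokens": 0},
    {"status": "success", "tokens": 2000},
]
stats = summarize_executions(sample)
print(stats)
```

With these sample records, 4,000 tokens at $0.075 per 1M tokens comes to roughly $0.0003, and three of four executions succeed.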
4. Use Specific Filters
Narrow down triggers with labels and property filters to avoid unnecessary LLM calls:
✅ Good:
{
"labels": ["Document"],
"has_property": "content"
}
❌ Too broad:
{
"labels": [] // Matches ALL nodes!
}
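One way to catch overly broad filters before a rule goes live is a small pre-save check. The helper below is a hypothetical sketch, not part of KnowledgeFlowDB: it flags any filter that sets neither labels nor has_property.

```python
from typing import Optional


def broad_filter_warning(rule_filter: dict) -> Optional[str]:
    """Return a warning for a filter with no narrowing criteria, else None."""
    has_labels = bool(rule_filter.get("labels"))
    has_property = bool(rule_filter.get("has_property"))
    if not has_labels and not has_property:
        return "filter matches every node; add labels or has_property"
    return None


print(broad_filter_warning({"labels": []}))
print(broad_filter_warning({"labels": ["Document"], "has_property": "content"}))
```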
5. Set Reasonable Limits
Configure max_tokens based on your operation:
- Summaries: 100-200 tokens
- Embeddings: No limit needed
- Entity extraction: 300-500 tokens
- Classification: 10-50 tokens
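These limits could be kept in one lookup table and applied as defaults. The sketch below uses the upper end of each range listed above. `with_token_limit` and the config shape are hypothetical, and operations without an entry (such as custom) are left without a limit.

```python
# Upper ends of the suggested ranges; None means no limit is applied.
MAX_TOKENS_BY_OPERATION = {
    "summarize": 200,
    "generate_embedding": None,
    "extract_entities": 500,
    "classify": 50,
}


def with_token_limit(llm_config: dict) -> dict:
    """Return a copy of the config with a default max_tokens when none is set."""
    config = dict(llm_config)
    if "max_tokens" not in config:
        limit = MAX_TOKENS_BY_OPERATION.get(config["operation"])
        if limit is not None:
            config["max_tokens"] = limit
    return config


print(with_token_limit({"operation": "summarize"}))
print(with_token_limit({"operation": "classify", "max_tokens": 10}))
```

An explicit max_tokens always wins over the default, so individual rules can still be tuned tighter.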
6. Handle Failures Gracefully
Monitor execution logs for failures and adjust:
- Prompt templates that are too vague
- Token limits that are too low
- Missing properties in input data
API Reference
See the Automation API Reference for complete endpoint documentation.
Next Steps
- 📖 Read the Automation API Reference
- 🔧 Set up a local 3-node cluster (see deployment docs)
- 🚀 Deploy to production (see deployment docs)