Security Guard System

The Security Guard system provides protection against dangerous commands and operations in Stata dofiles. It acts as a safety layer between LLM-generated code and actual execution.

Overview

The Security Guard is enabled by default to prevent accidental execution of destructive operations. It validates all dofile code against a blacklist of dangerous commands and patterns before execution.

Key Features

Blacklist-Based Validation: Blocks known dangerous commands
Pattern Matching: Detects potentially dangerous operations using regex
Line-by-Line Analysis: Provides detailed risk reporting with line numbers
Configurable: Can be disabled if needed (not recommended)
Zero False Positives for Safe Code: Only flags genuinely dangerous operations

Dangerous Commands

The Security Guard blocks the following dangerous commands:

Shell Execution Commands

Command	Description	Example
`!`	Unix-style shell escape	`! ls -la`
`!!`	Extended shell command	`!! vi file.do`
`shell`	Shell command execution	`shell dir`
`xshell`	Extended shell for Mac/Unix(GUI)	`xshell vi file.do`
`winexec`	Windows program execution	`winexec notepad.exe`
`unixcmd`	Unix command execution	`unixcmd ls`

File Operations

Command	Description	Risk
`erase`	File deletion	Data loss
`rm`	File deletion (alias)	Data loss
`rmdir`	Directory removal	Data loss
`copy`	File copy	Can overwrite files

Code Execution

Command	Description	Risk
`run`	Run another do-file	Untrusted code execution
`do`	Execute do-file	Untrusted code execution
`include`	Include another do-file	Untrusted code execution

Configuration

Enable/Disable Security Guard

Option 1: Configuration File

Edit ~/.statamcp/config.toml:

[SECURITY]
IS_GUARD = true  # Default: true

Option 2: Environment Variable

# Enable (default)
export STATA_MCP__IS_GUARD=true

# Disable (not recommended)
export STATA_MCP__IS_GUARD=false

Default Behavior

Default: Enabled (IS_GUARD = true)
Recommended: Keep enabled for production use
Development: Can be disabled for testing (use with caution)

Usage

Basic Usage

The Security Guard is automatically applied when using the stata_do tool:

# When IS_GUARD is enabled (default)
result = stata_mcp.stata_do(code="""
    sysuse auto
    regress price mpg weight
""")

# Safe code executes normally

Security Validation Example

When dangerous code is detected:

result = stata_mcp.stata_do(code="""
    sysuse auto
    ! rm -rf /  # Dangerous command
""")

# Error: Security validation failed
# ❌ Security validation failed. Found dangerous items:
#   - Line 3: command '!'

Programmatic Usage

If you're using Stata-MCP as a library:

from stata_mcp.guard import GuardValidator

# Create validator
validator = GuardValidator()

# Validate code
code = """
sysuse auto
regress price mpg weight
"""

report = validator.validate(code)

if report.is_safe:
    print("✅ Code is safe to execute")
else:
    print(f"❌ Found {len(report.dangerous_items)} dangerous items:")
    for item in report.dangerous_items:
        print(f"  {item}")

Security Report

SecurityReport Object

@dataclass
class SecurityReport:
    is_safe: bool                              # True if no dangerous items found
    dangerous_items: List[RiskItem]            # List of detected risks

RiskItem Object

@dataclass
class RiskItem:
    type: str                                  # "command" or "pattern"
    content: str                               # The dangerous content
    line: int                                  # Line number (1-indexed)

Example Output

# Safe code
report = validator.validate("sysuse auto")
print(report)
# Output: ✅ Code passed security validation

# Dangerous code
report = validator.validate("! rm file.txt")
print(report)
# Output:
# ❌ Security validation failed. Found dangerous items:
#   - Line 1: command '!'

Dangerous Patterns

The Security Guard uses regex patterns to detect dangerous operations:

Shell Command Patterns

r"!\s*\w+"           # Shell escape with command: ! ls
r"!!\s*\w+"          # Extended shell: !! vi file.do
r"shell\s+\w+"       # Shell command: shell dir
r"xshell\s+\w+"      # Extended shell: xshell vi file.do
r"winexec\s+\S+"     # Windows execution: winexec program.exe
r"unixcmd\s+\w+"     # Unix command: unixcmd ls

File Operation Patterns

r"erase\s+.*"        # File deletion: erase file.dta
r"rm\s+.*"           # File deletion: rm file.dta
r"rmdir\s+.*"        # Directory removal: rmdir mydir
r"copy\s+.*"         # File copy: copy file1.dta file2.dta

Code Execution Patterns

r"run\s+.*"          # Run do-file: run script.do
r"\bdo\s+.*"         # Execute do-file: do script.do
r"include\s+.*"      # Include do-file: include setup.do

Validation Process

Step-by-Step Validation

Code Input: Receive dofile code string
Line Split: Split code into lines for line number tracking
Filtering: Skip empty lines and comments (starting with *)
Command Check: Check each line against dangerous commands
Pattern Check: Check each line against dangerous patterns
Report Generation: Create SecurityReport with all findings

Example Validation

code = """
* This is a comment
sysuse auto
! rm dangerous.txt  # Line 3
regress price mpg
"""

report = validator.validate(code)
# Report:
# ❌ Security validation failed. Found dangerous items:
#   - Line 3: command '!'

Best Practices

1. Keep Security Guard Enabled

[SECURITY]
IS_GUARD = true  # Always keep enabled

2. Review Security Reports

Always review security validation reports:

report = validator.validate(code)
if not report.is_safe:
    # Log or notify about dangerous items
    for item in report.dangerous_items:
        logger.warning(f"Dangerous item found: {item}")

3. Use Whitelist for Allowed Operations

If you need to execute certain dangerous operations:

Review the code manually
Remove dangerous commands
Use safe alternatives

Example:

* Instead of: ! rm tempfile.dta
* Use: erase tempfile.dta  (still blocked, but shows intent)

* Better approach: Use Stata's built-in safe operations
capture erase tempfile.dta

4. Educate Users

Document which commands are blocked and why:

## Blocked Commands

The following commands are blocked for security reasons:
- Shell commands (!, shell, xshell, etc.)
- File deletion (erase, rm)
- External code execution (run, do, include)

Please use safe alternatives or contact administrator for assistance.

Troubleshooting

False Positives

If you believe a command is incorrectly flagged:

Review the command: Is it actually dangerous?
Check for patterns: Does it match a dangerous pattern?
Consider alternatives: Is there a safer way to accomplish the task?

Disabling Security Guard

⚠️ Warning: Disabling the Security Guard is not recommended.

Only disable if: - You're in a trusted environment - All code is manually reviewed - You understand the risks

# Temporary disable (current session only)
export STATA_MCP__IS_GUARD=false
stata-mcp

# Permanent disable (add to config)
# Edit ~/.statamcp/config.toml
[SECURITY]
IS_GUARD = false

Customizing Blacklist

You can extend the blacklist by modifying the code:

from stata_mcp.guard.blacklist import DANGEROUS_COMMANDS

# Add custom dangerous commands
DANGEROUS_COMMANDS.add("my_dangerous_command")

# Use custom validator
from stata_mcp.guard import GuardValidator
validator = GuardValidator()
validator.dangerous_commands.add("another_command")

Security Considerations

What the Guard Protects Against

✅ Prevents: - Shell command execution - File deletion operations - Untrusted code execution - System-level operations

❌ Does Not Prevent: - Data modification within Stata - Infinite loops - Memory exhaustion - Stata crashes

Limitations

The Security Guard: - Does not analyze data flow - Does not track variable values - Does not prevent resource exhaustion - Is not a substitute for proper code review

Defense in Depth

The Security Guard is one layer of protection. Combine with:

Monitoring: Enable RAM monitoring (see Monitoring Documentation)
Sandboxing: Use isolated environments for untrusted code
Code Review: Manually review generated code
Backups: Maintain regular backups of important data

Integration with MCP Tools

Automatic Integration

The Security Guard is automatically integrated with:

stata_do: Execute Stata do-files (validated by default)
write_dofile: Create do-files (not validated until execution)
append_dofile: Append to do-files (not validated until execution)

Validation Flow

User Request → MCP Tool → Security Guard → Validation
                                      ↓
                                 Pass? → Execute
                                      ↓
                                 Fail? → Return Error

Extensibility

Creating Custom Validators

You can create custom validators for specific use cases:

from stata_mcp.guard import GuardValidator, RiskItem, SecurityReport

class CustomValidator(GuardValidator):
    """Custom validator with additional rules."""

    def validate(self, code: str) -> SecurityReport:
        # Get base validation results
        report = super().validate(code)

        # Add custom validation logic
        if "custom_dangerous_thing" in code:
            report.dangerous_items.append(
                RiskItem(
                    type="custom",
                    content="custom_dangerous_thing",
                    line=code.find("custom_dangerous_thing")
                )
            )
            report.is_safe = False

        return report

Combining Multiple Validators

from stata_mcp.guard import GuardValidator

# Create multiple validators
basic_validator = GuardValidator()
custom_validator = CustomValidator()

# Validate with both
code = "some stata code"
report1 = basic_validator.validate(code)
report2 = custom_validator.validate(code)

# Combine results
if report1.is_safe and report2.is_safe:
    print("✅ All validations passed")

Examples

Example 1: Safe Code Execution

from stata_mcp.guard import GuardValidator

validator = GuardValidator()

safe_code = """
* Load sample data
sysuse auto

* Run regression
regress price mpg weight

* Display results
display "R-squared: " + string(e(r2))
"""

report = validator.validate(safe_code)
print(report)
# Output: ✅ Code passed security validation

Example 2: Dangerous Code Detection

dangerous_code = """
sysuse auto

* This will be blocked
! rm -rf /important/data

regress price mpg
"""

report = validator.validate(dangerous_code)
print(report)
# Output:
# ❌ Security validation failed. Found dangerous items:
#   - Line 5: command '!'

Example 3: Multiple Violations

multiple_violations = """
sysuse auto
shell delete file.txt
run untrusted_script.do
"""

report = validator.validate(multiple_violations)
print(f"Found {len(report.dangerous_items)} violations:")
for item in report.dangerous_items:
    print(f"  {item}")
# Output:
# Found 2 violations:
#   Line 3: command 'shell'
#   Line 4: pattern 'run\s+.*'

Summary

The Security Guard system provides essential protection for automated Stata execution:

✅ Enabled by default for safety
✅ Blocks dangerous commands (shell, file deletion, etc.)
✅ Pattern-based detection for comprehensive coverage
✅ Detailed reporting with line numbers
✅ Configurable for different use cases
✅ Zero overhead when disabled (not recommended)

For production use, always keep the Security Guard enabled and review validation reports regularly.