Security Guard System
The Security Guard system provides protection against dangerous commands and operations in Stata dofiles. It acts as a safety layer between LLM-generated code and actual execution.
Overview
The Security Guard is enabled by default to prevent accidental execution of destructive operations. It validates all dofile code against a blacklist of dangerous commands and patterns before execution.
Key Features
- Blacklist-Based Validation: Blocks known dangerous commands
- Pattern Matching: Detects potentially dangerous operations using regex
- Line-by-Line Analysis: Provides detailed risk reporting with line numbers
- Configurable: Can be disabled if needed (not recommended)
- Zero False Positives for Safe Code: Only flags genuinely dangerous operations
Dangerous Commands
The Security Guard blocks the following dangerous commands. Stata resolves any unambiguous prefix to its full command (for example sh to shell), so the blacklist enumerates both the full names and the minimum abbreviations that Stata accepts. GuardValidator matches both forms before execution.
Shell Execution Commands
| Command | Abbreviation | Description | Example |
|---|---|---|---|
! |
- | Unix-style shell escape | ! ls -la |
!! |
- | Extended shell command | !! vi file.do |
shell |
sh |
Shell command execution | shell dir |
xshell |
xsh |
Extended shell for Mac/Unix (GUI) | xshell vi file.do |
winexec |
winex |
Windows program execution | winexec notepad.exe |
unixcmd |
unixc |
Unix command execution (macOS/Linux) | unixcmd ls |
File Operations
| Command | Abbreviation | Description | Risk |
|---|---|---|---|
erase |
era |
File deletion | Data loss |
rm |
- | File deletion (alias) | Data loss |
rmdir |
rmd |
Directory removal | Data loss |
copy |
- | File copy | Can overwrite files |
Code Execution
| Command | Description | Risk |
|---|---|---|
run |
Run another do-file | Untrusted code execution |
do |
Execute do-file | Untrusted code execution |
include |
Include another do-file | Untrusted code execution |
Dofile Directory Boundary
Since v1.16.2, stata_do rejects any dofile that resolves outside an approved root directory. The boundary check runs before the Guard validator and applies to both the MCP server and the CLI stata-mcp tool do entry point.
Allowed Roots
A dofile is accepted only when its resolved absolute path lies under one of the following directories:
| Root | Source |
|---|---|
<WORKING_DIR>/<FOLDER_TAG>/stata-mcp-dofile/ |
STATA_MCP_FOLDER.DO |
<WORKING_DIR> |
Configured working directory |
WORKING_DIR is taken from STATA_MCP__CWD (or the legacy STATA_MCP_CWD); the dofile root is the stata-mcp-dofile/ subdirectory under it. Symbolic links are resolved before the check, so a link pointing outside the allowed roots is denied.
Rejection Behaviour
When the resolved path is outside the allowed roots, stata_do returns an error of the form Access denied: Dofile '<path>' is outside allowed directories. together with the list of allowed roots, and a [SECURITY VIOLATION] warning is written to the log with the requested path, the resolved path, and the configured roots.
Operational Guidance
Place dofiles either under the configured working directory or let MCP-for-Stata generate them into stata-mcp-dofile/. To execute dofiles that live elsewhere, point STATA_MCP__CWD at a parent directory that already contains them rather than relaxing the check.
Local Macro Expansion Detection
A naive blacklist that only inspects each line's first token can be bypassed by hiding a dangerous command inside a local macro and expanding it later. Since v1.16.2 GuardValidator performs a two-pass check that closes this gap.
Detection Logic
- Pass one collects every
local <name> "<value>"(andlocal <name> = "<value>") definition. When the trimmed value is itself inDANGEROUS_COMMANDS, the local name is added to a tainted set. - Pass two scans every non-comment line for
`name'expansions of any tainted local and records each match as amacrorisk in theSecurityReport.
The local name pattern is \w+, so suffixed or prefixed variants such as local cmd_x "shell" are tracked alongside plain names. Stata prefixes (capture, quietly, noisily, ...) are stripped before both passes, so they cannot mask a tainted definition.
Counter-Example
local cmd_x "shell"
`cmd_x' rm -rf /tmp/important
Without macro tracking the second line begins with a backtick and would slip past a first-token check. With macro tracking enabled the validator reports a macro risk on the second line, and stata_do refuses execution.
Coverage Caveats
The macro tracker matches single-token dangerous values only. Concatenated assignments such as local cmd = "she" + "ll" or values constructed at runtime are outside its scope and remain a known limitation of static validation. Keep the Guard enabled and review generated dofiles when external sources contribute their content.
Guard Disable Warning
STATA_MCP__IS_GUARD=false (or [SECURITY] IS_GUARD = false in ~/.statamcp/config.toml) disables the validator entirely. Disabling the Guard skips every blacklist, pattern, and macro check, so any dofile content reaches Stata unchanged. Each stata_do call emits the warning [SECURITY] Guard is disabled. Dangerous dofile commands will not be blocked. to the log; the same line is recorded at server startup when the configuration is read.
Disable the Guard only inside controlled environments such as the Docker sandbox installation or an ephemeral VM, and re-enable it once the trusted task completes. The directory boundary check above continues to apply even when the Guard is disabled.
Configuration
Enable/Disable Security Guard
Option 1: Configuration File
Edit ~/.statamcp/config.toml:
[SECURITY]
IS_GUARD = true # Default: true
Option 2: Environment Variable
# Enable (default)
export STATA_MCP__IS_GUARD=true
# Disable (not recommended)
export STATA_MCP__IS_GUARD=false
Default Behavior
- Default: Enabled (
IS_GUARD = true) - Recommended: Keep enabled for production use
- Development: Can be disabled for testing (use with caution)
Usage
Basic Usage
The Security Guard is automatically applied when using the stata_do tool:
# When IS_GUARD is enabled (default)
result = stata_mcp.stata_do(code="""
sysuse auto
regress price mpg weight
""")
# Safe code executes normally
Security Validation Example
When dangerous code is detected:
result = stata_mcp.stata_do(code="""
sysuse auto
! rm -rf / # Dangerous command
""")
# Error: Security validation failed
# ❌ Security validation failed. Found dangerous items:
# - Line 3: command '!'
Programmatic Usage
If you're using MCP-for-Stata as a library:
from stata_mcp.guard import GuardValidator
# Create validator
validator = GuardValidator()
# Validate code
code = """
sysuse auto
regress price mpg weight
"""
report = validator.validate(code)
if report.is_safe:
print("✅ Code is safe to execute")
else:
print(f"❌ Found {len(report.dangerous_items)} dangerous items:")
for item in report.dangerous_items:
print(f" {item}")
Security Report
SecurityReport Object
@dataclass
class SecurityReport:
is_safe: bool # True if no dangerous items found
dangerous_items: List[RiskItem] # List of detected risks
RiskItem Object
@dataclass
class RiskItem:
type: str # "command" or "pattern"
content: str # The dangerous content
line: int # Line number (1-indexed)
Example Output
# Safe code
report = validator.validate("sysuse auto")
print(report)
# Output: ✅ Code passed security validation
# Dangerous code
report = validator.validate("! rm file.txt")
print(report)
# Output:
# ❌ Security validation failed. Found dangerous items:
# - Line 1: command '!'
Dangerous Patterns
The Security Guard uses regex patterns to detect dangerous operations:
Shell Command Patterns
r"!\s*\w+" # Shell escape with command: ! ls
r"!!\s*\w+" # Extended shell: !! vi file.do
r"shell\s+\w+" # Shell command: shell dir
r"xshell\s+\w+" # Extended shell: xshell vi file.do
r"winexec\s+\S+" # Windows execution: winexec program.exe
r"unixcmd\s+\w+" # Unix command: unixcmd ls
File Operation Patterns
r"erase\s+.*" # File deletion: erase file.dta
r"rm\s+.*" # File deletion: rm file.dta
r"rmdir\s+.*" # Directory removal: rmdir mydir
r"copy\s+.*" # File copy: copy file1.dta file2.dta
Code Execution Patterns
r"run\s+.*" # Run do-file: run script.do
r"\bdo\s+.*" # Execute do-file: do script.do
r"include\s+.*" # Include do-file: include setup.do
Validation Process
Step-by-Step Validation
- Code Input: Receive dofile code string
- Line Split: Split code into lines for line number tracking
- Filtering: Skip empty lines and comments (starting with
*) - Command Check: Check each line against dangerous commands
- Pattern Check: Check each line against dangerous patterns
- Report Generation: Create SecurityReport with all findings
Example Validation
code = """
* This is a comment
sysuse auto
! rm dangerous.txt # Line 3
regress price mpg
"""
report = validator.validate(code)
# Report:
# ❌ Security validation failed. Found dangerous items:
# - Line 3: command '!'
Best Practices
1. Keep Security Guard Enabled
[SECURITY]
IS_GUARD = true # Always keep enabled
2. Review Security Reports
Always review security validation reports:
report = validator.validate(code)
if not report.is_safe:
# Log or notify about dangerous items
for item in report.dangerous_items:
logger.warning(f"Dangerous item found: {item}")
3. Use Whitelist for Allowed Operations
If you need to execute certain dangerous operations:
- Review the code manually
- Remove dangerous commands
- Use safe alternatives
Example:
* Instead of: ! rm tempfile.dta
* Use: erase tempfile.dta (still blocked, but shows intent)
* Better approach: Use Stata's built-in safe operations
capture erase tempfile.dta
4. Educate Users
Document which commands are blocked and why:
## Blocked Commands
The following commands are blocked for security reasons:
- Shell commands (!, shell, xshell, etc.)
- File deletion (erase, rm)
- External code execution (run, do, include)
Please use safe alternatives or contact administrator for assistance.
Troubleshooting
False Positives
If you believe a command is incorrectly flagged:
- Review the command: Is it actually dangerous?
- Check for patterns: Does it match a dangerous pattern?
- Consider alternatives: Is there a safer way to accomplish the task?
Disabling Security Guard
⚠️ Warning: Disabling the Security Guard is not recommended.
Only disable if: - You're in a trusted environment - All code is manually reviewed - You understand the risks
# Temporary disable (current session only)
export STATA_MCP__IS_GUARD=false
stata-mcp
# Permanent disable (add to config)
# Edit ~/.statamcp/config.toml
[SECURITY]
IS_GUARD = false
Customizing Blacklist
You can extend the blacklist by modifying the code:
from stata_mcp.guard.blacklist import DANGEROUS_COMMANDS
# Add custom dangerous commands
DANGEROUS_COMMANDS.add("my_dangerous_command")
# Use custom validator
from stata_mcp.guard import GuardValidator
validator = GuardValidator()
validator.dangerous_commands.add("another_command")
Security Considerations
What the Guard Protects Against
✅ Prevents: - Shell command execution - File deletion operations - Untrusted code execution - System-level operations
❌ Does Not Prevent: - Data modification within Stata - Infinite loops - Memory exhaustion - Stata crashes
Limitations
The Security Guard: - Does not analyze data flow - Does not track variable values - Does not prevent resource exhaustion - Is not a substitute for proper code review
Defense in Depth
The Security Guard is one layer of protection. Combine with:
- Monitoring: Enable RAM monitoring (see Monitoring Documentation)
- Sandboxing: Use isolated environments for untrusted code
- Code Review: Manually review generated code
- Backups: Maintain regular backups of important data
Integration with MCP Tools
Automatic Integration
The Security Guard is automatically integrated with:
stata_do: Execute Stata do-files (validated by default)write_dofile: Create do-files (not validated until execution)
Validation Flow
User Request → MCP Tool → Security Guard → Validation
↓
Pass? → Execute
↓
Fail? → Return Error
Extensibility
Creating Custom Validators
You can create custom validators for specific use cases:
from stata_mcp.guard import GuardValidator, RiskItem, SecurityReport
class CustomValidator(GuardValidator):
"""Custom validator with additional rules."""
def validate(self, code: str) -> SecurityReport:
# Get base validation results
report = super().validate(code)
# Add custom validation logic
if "custom_dangerous_thing" in code:
report.dangerous_items.append(
RiskItem(
type="custom",
content="custom_dangerous_thing",
line=code.find("custom_dangerous_thing")
)
)
report.is_safe = False
return report
Combining Multiple Validators
from stata_mcp.guard import GuardValidator
# Create multiple validators
basic_validator = GuardValidator()
custom_validator = CustomValidator()
# Validate with both
code = "some stata code"
report1 = basic_validator.validate(code)
report2 = custom_validator.validate(code)
# Combine results
if report1.is_safe and report2.is_safe:
print("✅ All validations passed")
Examples
Example 1: Safe Code Execution
from stata_mcp.guard import GuardValidator
validator = GuardValidator()
safe_code = """
* Load sample data
sysuse auto
* Run regression
regress price mpg weight
* Display results
display "R-squared: " + string(e(r2))
"""
report = validator.validate(safe_code)
print(report)
# Output: ✅ Code passed security validation
Example 2: Dangerous Code Detection
dangerous_code = """
sysuse auto
* This will be blocked
! rm -rf /important/data
regress price mpg
"""
report = validator.validate(dangerous_code)
print(report)
# Output:
# ❌ Security validation failed. Found dangerous items:
# - Line 5: command '!'
Example 3: Multiple Violations
multiple_violations = """
sysuse auto
shell delete file.txt
run untrusted_script.do
"""
report = validator.validate(multiple_violations)
print(f"Found {len(report.dangerous_items)} violations:")
for item in report.dangerous_items:
print(f" {item}")
# Output:
# Found 2 violations:
# Line 3: command 'shell'
# Line 4: pattern 'run\s+.*'
Summary
The Security Guard system provides essential protection for automated Stata execution:
- ✅ Enabled by default for safety
- ✅ Blocks dangerous commands (shell, file deletion, etc.)
- ✅ Pattern-based detection for comprehensive coverage
- ✅ Detailed reporting with line numbers
- ✅ Configurable for different use cases
- ✅ Zero overhead when disabled (not recommended)
For production use, always keep the Security Guard enabled and review validation reports regularly.