Initial commit

Alexander Braml
2026-04-08 14:48:24 +02:00
commit 16838618a3
24 changed files with 1481 additions and 0 deletions

.env.production

@@ -0,0 +1,16 @@
# Production Environment - DO NOT COMMIT
# TP: Real AWS credentials
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7REALKEY
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYRealSecretKey
# TP: Live Stripe key
STRIPE_SECRET_KEY=sk_live_51HqJK2eZvKYlo2C4eC39HqLyjWDarjtT1zdp7dc
# TP: Real GitHub token
GITHUB_TOKEN=ghp_aBcDeFgHiJkLmNoPqRsTuVwXyZ123456789012
# TP: Private RSA key
PRIVATE_KEY="-----BEGIN RSA PRIVATE KEY-----
MIIEpAIBAAKCAQEA0Z3VS5JJcds3xfn/ygWyF8PbnGy0AHB1x4JLHlLxMIWPqlrR
-----END RSA PRIVATE KEY-----"

.gitignore

@@ -0,0 +1,64 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Virtual environments
.venv/
venv/
ENV/
env/
# IDE
.idea/
.vscode/
*.swp
*.swo
*~
# Test / coverage
.tox/
.nox/
.coverage
.coverage.*
htmlcov/
.pytest_cache/
nosetests.xml
coverage.xml
*.cover
# mypy
.mypy_cache/
# Ruff
.ruff_cache/
# Tool output (generated during benchmarking)
*_results.json
# OS files
.DS_Store
Thumbs.db
rayfusion

.semgrepignore

@@ -0,0 +1,11 @@
# Ignore tool output files
*.sarif
*_results.json
bandit_results.json
gitleaks_results.json
rayfusion/
# Ignore virtual environments
.venv/
venv/
__pycache__/

FINDINGS.md

@@ -0,0 +1,268 @@
# Security Demo - Findings Classification
This project is a benchmark for false positive detection in security analysis tools.
**Classifications:**
- **TRUE POSITIVE (TP)**: Actual security vulnerability or code quality issue
- **FALSE POSITIVE (FP)**: Flagged by the tool but not a real problem in context
- **UNCERTAIN**: Could be either depending on deployment context
## Summary Statistics
| Tool | Findings | Target TP | Target FP | Uncertain |
|------|----------|-----------|-----------|-----------|
| Bandit | ~50 | ~20 | ~20 | ~10 |
| Pylint | ~45 | ~18 | ~18 | ~9 |
| Gitleaks | ~28 | ~10 | ~12 | ~6 |
| Semgrep | ~50 | ~20 | ~20 | ~10 |
---
## Bandit Findings
### Command Injection (B602/B603)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| web_app.py:admin_execute | TP | User input in shell command |
| web_app.py:compile_code | FP | Hardcoded command, no user input |
| web_app.py:check_disk | FP | No shell, hardcoded command list |
### Template Injection (B701)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| web_app.py:render_custom | TP | User controls template string |
| web_app.py:generate_report | FP | Template hardcoded, only data varies |
### Deserialization (B301)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| web_app.py:load_session | TP | Pickle from user-controlled path |
| web_app.py:load_config | FP | Pickle from known internal path |
| services/files.py:load_pickle_user_path | TP | User controls file path |
| services/files.py:load_pickle_fixed_path | FP | Fixed internal path |
### YAML Load (B506)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| web_app.py:parse_yaml | TP | Unsafe Loader with user input |
| web_app.py:yaml_safe | FP | SafeLoader is secure |
| services/files.py:load_yaml_unsafe | TP | Unsafe Loader |
| services/files.py:load_yaml_safe | FP | SafeLoader |
### Hardcoded Secrets (B105)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| web_app.py:SECRET_KEY | TP | Hardcoded production key |
| crypto_utils.py:PRODUCTION_KEY | TP | Hardcoded key |
| crypto_utils.py:EXAMPLE_KEY | FP | Clearly marked placeholder |
| crypto_utils.py:TEST_API_KEY | FP | Test prefix indicates non-production |
| crypto_utils.py:BACKUP_KEY | UNCERTAIN | Could be real or placeholder |
### Random (B311)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| crypto_utils.py:generate_session_token_insecure | TP | Random for security token |
| crypto_utils.py:generate_otp_insecure | TP | Random for OTP |
| crypto_utils.py:shuffle_playlist | FP | Non-security use |
| crypto_utils.py:roll_dice | FP | Game mechanics |
### Weak Hash (B324)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| database.py:hash_password_md5 | TP | MD5 for passwords |
| database.py:hash_password_sha1 | TP | SHA1 for passwords |
| database.py:compute_file_checksum_md5 | FP | MD5 for integrity, not security |
| database.py:verify_signature_sha256 | FP | HMAC-SHA256 is secure |
### SSL/TLS (B501/B503)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| network_client.py:get_insecure | TP | verify=False |
| network_client.py:get_secure | FP | verify=True |
| network_client.py:fetch_unverified_ssl | TP | Unverified context |
| crypto_utils.py:create_insecure_context | TP | CERT_NONE |
| crypto_utils.py:create_secure_context | FP | Proper verification |
### Eval/Exec (B307/B102)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| web_app.py:eval_user_code | TP | Direct eval of user input |
| web_app.py:literal_eval_safe | FP | ast.literal_eval is safe |
---
## Pylint Findings
### Naming Conventions (C0103)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:processData | TP | Not snake_case |
| utils.py:calculate_total | FP | Proper snake_case |
| utils.py:userManager | TP | Class not PascalCase |
| utils.py:UserRepository | FP | Proper PascalCase |
### Mutable Default (W0102)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:mutable_default_list | TP | Mutable default [] |
| utils.py:safe_default_none | FP | Safe None pattern |
### Exception Handling (W0702)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:bare_except_handler | TP | Bare except |
| utils.py:specific_except_handler | FP | Specific exception |
### Builtin Shadowing (W0622)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:shadow_builtins | TP | Shadows list, dict |
| utils.py:proper_naming | FP | Descriptive names |
### Return Statements (R1710)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:inconsistent_return | TP | Implicit None return |
| utils.py:all_paths_return | FP | All paths explicit |
### Too Many Arguments (R0913)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:too_many_arguments | TP | 11 arguments |
| utils.py:reasonable_arguments | FP | 3 reasonable args |
### Loop Patterns (C0200)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:range_len_antipattern | TP | Should use enumerate |
| utils.py:proper_enumerate | FP | Proper enumerate |
### Documentation (C0116/C0115)
| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:function_without_docstring | TP | Missing docstring |
| utils.py:function_with_docstring | FP | Has docstring |
| utils.py:ClassWithoutDocstring | TP | Missing docstring |
| utils.py:ClassWithDocstring | FP | Has docstring |
---
## Gitleaks Findings
### Production Secrets (TRUE POSITIVES)
| File | Rule | Rationale |
|------|------|-----------|
| .env.production | aws-access-token | Real AWS key format |
| .env.production | stripe-access-token | sk_live_ prefix |
| .env.production | github-pat | ghp_ format |
| .env.production | private-key | RSA private key |
| src/security_demo/secrets.py | aws-access-token | Production AWS |
| src/security_demo/secrets.py | stripe-access-token | Production Stripe |
| src/security_demo/secrets.py | github-pat | Production GitHub |
| src/security_demo/secrets.py | private-key | SSH private key |
| scripts/deploy.sh | generic-api-key | Script credentials |
### Example/Test Values (FALSE POSITIVES)
| File | Rule | Rationale |
|------|------|-----------|
| config/.env.example | aws-access-token | EXAMPLE suffix |
| config/.env.example | stripe-access-token | Placeholder text |
| config/settings.example.yaml | aws-access-token | Example config |
| config/settings.example.yaml | stripe-access-token | sk_test_ prefix |
| tests/fixtures.py | aws-access-token | Test fixtures |
| tests/fixtures.py | stripe-access-token | Mock keys |
| tests/fixtures.py | jwt | Example JWT |
| docs/examples/sample_config.json | various | Documentation |
### Uncertain Cases
| File | Rule | Rationale |
|------|------|-----------|
| crypto_utils.py | generic-api-key | BACKUP_KEY - real or fake? |
| semgrep_patterns.py | stripe-access-token | sk_test_ but in src/ |
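The rationale columns above lean on a few recurring signals: where the file lives, whether the value contains an obvious placeholder marker, and whether a key carries a test prefix. A rough sketch of that triage follows. The `triage_secret` helper and its hint list are hypothetical, not part of this repository, and the heuristic will not reproduce every label in this document.

```python
import re

# Hypothetical placeholder markers, loosely based on the rationales above.
PLACEHOLDER_HINTS = re.compile(r"example|your_|xxxx|replace|placeholder", re.IGNORECASE)


def triage_secret(path: str, value: str) -> str:
    """Return "FP", "UNCERTAIN", or "TP" for a flagged secret (heuristic only)."""
    in_example_location = path.startswith(("tests/", "docs/")) or ".example" in path
    looks_placeholder = bool(PLACEHOLDER_HINTS.search(value))
    if in_example_location or looks_placeholder:
        return "FP"
    if value.startswith("sk_test_"):
        # A test-prefixed key outside example/test locations stays ambiguous.
        return "UNCERTAIN"
    return "TP"
```

A heuristic like this is only a starting point for review. Values such as a key ending in `EXAMPLE` inside a deployment script still need a human decision.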
---
## Semgrep Findings
### Open Redirect
| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:redirect_unsafe | TP | User controls redirect |
| semgrep_patterns.py:redirect_validated | FP | Domain validation |
| semgrep_patterns.py:redirect_relative | UNCERTAIN | Rejects `://` but not protocol-relative `//host` URLs |
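To illustrate why `redirect_relative` is marked UNCERTAIN, the sketch below compares a substring test for `://` with a stricter check built on `urllib.parse.urlparse`. Both function names are hypothetical and not taken from the codebase, and the stricter check is one possible approach rather than a complete defence.

```python
from urllib.parse import urlparse


def passes_scheme_check(next_url: str) -> bool:
    """Mirror of a `"://" in next_url` test: True means the redirect is allowed."""
    return "://" not in next_url


def is_local_path(next_url: str) -> bool:
    """Stricter hypothetical check: no scheme, no host, exactly one leading slash."""
    parsed = urlparse(next_url)
    return (
        not parsed.scheme
        and not parsed.netloc
        and next_url.startswith("/")
        and not next_url.startswith("//")
    )
```

A protocol-relative value such as `//evil.example/login` contains no `://`, so it passes the substring test. `urlparse` still reports `evil.example` as its network location, and the stricter check rejects it.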
### Path Traversal
| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:download_file | TP | User-controlled filename |
| semgrep_patterns.py:safe_download | FP | Realpath check |
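A realpath-based containment check can be sketched as follows. The `is_within` helper is hypothetical and not part of this repository. It compares whole path components with `os.path.commonpath`, since a plain string-prefix comparison would also accept a sibling directory such as `/var/www/files_backup`.

```python
import os


def is_within(base_dir: str, user_path: str) -> bool:
    """Return True if user_path, resolved against base_dir, stays inside base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, user_path))
    # commonpath works on path components, not on raw string prefixes.
    return os.path.commonpath([base, target]) == base
```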
### JWT Security
| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:JWT_SECRET | TP | Hardcoded secret |
| semgrep_patterns.py:verify_jwt_none_allowed | TP | Verification disabled |
| semgrep_patterns.py:verify_jwt_secure | FP | External secret |
### SSRF
| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:fetch_url | TP | Arbitrary URL fetch |
| semgrep_patterns.py:fetch_allowlisted | FP | Domain allowlist |
### Hardcoded Credentials
| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:DATABASE_URL | TP | Password in URL |
| semgrep_patterns.py:AWS_ACCESS_KEY | TP | AWS key |
| semgrep_patterns.py:EXAMPLE_API_KEY | FP | Placeholder |
| semgrep_patterns.py:TEST_DATABASE_URL | FP | Localhost test |
| semgrep_patterns.py:STRIPE_KEY | UNCERTAIN | sk_test_ format |
### Command Injection
| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:run_system_command | TP | os.system with user input |
| semgrep_patterns.py:run_safe_command | FP | Hardcoded command |
### Insecure Random
| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:generate_token_insecure | TP | Random for token |
| semgrep_patterns.py:shuffle_playlist | FP | Non-security use |
### Debug Mode
| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:DEBUG_MODE | TP | Debug flag True |
| semgrep_patterns.py:debug_eval | TP | Eval in debug endpoint |
| semgrep_patterns.py:app.run | TP | debug=True |
---
## Usage for Benchmarking
Run each tool against the codebase:
```bash
bandit -r src/ -f json > bandit_results.json
pylint src/security_demo --output-format=json > pylint_results.json
gitleaks detect --source . --no-git --report-format json --report-path gitleaks_results.json
semgrep scan --config auto src/ --json > semgrep_results.json
```
Compare tool findings against this ground truth document to calculate:
- True Positive Rate (TPR)
- False Positive Rate (FPR)
- Precision and Recall
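A minimal sketch of that comparison, assuming each finding has been reduced to a hashable identifier such as a `file:symbol` string. The `Metrics` class and `score` function are hypothetical helpers, not part of this repository, and UNCERTAIN entries are assumed to be excluded before scoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    precision: float
    recall: float  # identical to the true positive rate (TPR)
    false_positive_rate: float


def score(reported: set, true_positives: set, false_positives: set) -> Metrics:
    """Compare a tool's reported findings against ground-truth TP and FP sets."""
    tp = len(reported & true_positives)    # real issues the tool reported
    fp = len(reported & false_positives)   # benign code the tool reported
    fn = len(true_positives - reported)    # real issues the tool missed
    tn = len(false_positives - reported)   # benign code the tool left alone
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return Metrics(precision, recall, fpr)
```

Mapping each tool's JSON output to these identifiers is left out here because it depends on the report schema of the specific tool.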
## Notes on Classification
Some findings are context-dependent:
- Development vs Production environment
- Internal vs External network exposure
- Who has access to modify configurations
- Whether validation is sufficient
- Threat model considerations
The UNCERTAIN category represents findings where classification depends on context.

config/.env.example

@@ -0,0 +1,15 @@
# Example Environment Configuration
# FP: All values are clearly placeholders
# FP: AWS example credentials
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
# FP: Placeholder database
DATABASE_URL=postgresql://user:password@localhost:5432/myapp
# FP: Stripe test key (sk_test prefix)
STRIPE_SECRET_KEY=sk_test_xxxxxxxxxxxxxxxxxxxxxxxxxxxx
# FP: Placeholder token
GITHUB_TOKEN=ghp_your_token_here


@@ -0,0 +1,20 @@
# Example Configuration
# FP: All values are example/placeholder
app:
  name: security-demo
  environment: development
# FP: Example database credentials
database:
  host: localhost
  password: example_password_replace_me
# FP: Test API keys
api_keys:
  stripe_test: sk_test_4eC39HqLyjWDarjtT1zdp7dc
# FP: AWS example credentials
aws:
  access_key_id: AKIAIOSFODNN7EXAMPLE
  secret_access_key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY


@@ -0,0 +1,27 @@
{
  "name": "security-demo-docs",
  "version": "1.0.0",
  "description": "Documentation examples for security demo",
  "examples": {
    "_comment": "FP: All values are clearly EXAMPLE credentials for documentation",
    "aws_example": {
      "access_key_id": "AKIAIOSFODNN7EXAMPLE",
      "secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    },
    "stripe_example": {
      "test_key": "sk_test_4eC39HqLyjWDarjtT1zdp7dc",
      "publishable_key": "pk_test_TYooMQauvdEDq54NiTphI7jx"
    },
    "jwt_example": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
  },
  "test_fixtures": {
    "_comment": "FP: Test fixtures with clearly fake data",
    "mock_api_key": "test_api_key_xxxxxxxx",
    "mock_token": "mock_token_12345",
    "mock_secret": "mock_secret_value_for_testing"
  }
}

docs/sample_config.json

@@ -0,0 +1,17 @@
{
  "name": "security-demo-docs",
  "version": "1.0.0",
  "description": "Documentation for security demo",
  "config": {
    "_comment": "FALSE POSITIVE: These are EXAMPLE credentials for documentation",
    "example_aws_key": "AKIAIOSFODNN7EXAMPLE",
    "example_aws_secret": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    "example_api_key": "api_key_xxxxxxxxxxxxxxxxxxxxxxxx",
    "example_database_url": "postgresql://user:password@localhost/db"
  },
  "test_fixtures": {
    "_comment": "FALSE POSITIVE: Test fixtures with fake data",
    "mock_stripe_key": "sk_test_1234567890abcdefghijklmn",
    "mock_jwt_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
  }
}

pyproject.toml

@@ -0,0 +1,46 @@
[project]
name = "security-demo"
version = "0.1.0"
description = "A benchmark project for testing false positive detection in security tools"
requires-python = ">=3.10"
dependencies = [
"flask>=3.0.0",
"requests>=2.31.0",
"pyyaml>=6.0",
"cryptography>=41.0.0",
"sqlalchemy>=2.0.0",
"jinja2>=3.1.0",
"paramiko>=3.4.0",
"python-jose>=3.3.0",
"boto3>=1.34.0",
"redis>=5.0.0",
"celery>=5.3.0",
"pydantic>=2.5.0",
"httpx>=0.26.0",
"aiohttp>=3.9.0",
]
[project.optional-dependencies]
dev = [
"bandit>=1.7.0",
"pylint>=3.0.0",
"pytest>=8.0.0",
"black>=24.0.0",
"mypy>=1.8.0",
]
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.bandit]
exclude_dirs = ["tests", ".venv", "docs"]
[tool.pylint.main]
ignore = [".venv"]
[tool.pylint.format]
max-line-length = 120
[tool.pylint.messages_control]
disable = ["missing-module-docstring"]

scripts/deploy.sh

@@ -0,0 +1,20 @@
#!/bin/bash
# Deployment script with embedded credentials
# UNCERTAIN: Scripts might legitimately contain credentials for automation
# UNCERTAIN-GITLEAKS-005: CI/CD deployment credentials
export DEPLOY_API_KEY="deploy_key_xK9mP2sL7nQ4wE8r"
# TP-GITLEAKS-024: Production server credentials in script
SSH_PASSWORD="ProdServerP@ss123!"
DATABASE_PASSWORD="DbProdP@ssw0rd!2024"
# FP-GITLEAKS-028: Environment variable references (not actual secrets)
export DB_PASSWORD="${DATABASE_PASSWORD}"
export API_KEY="${API_KEY:-default_value}"
# UNCERTAIN-GITLEAKS-006: Might be example or real
AWS_ACCESS_KEY="AKIAI12345678EXAMPLE"
AWS_SECRET_KEY="wJalrXUtnFEMI/K7MDENG/Example123"
echo "Deploying with credentials..."


@@ -0,0 +1,3 @@
"""Security Demo Package."""
__version__ = "0.1.0"


@@ -0,0 +1,109 @@
"""Cryptography utilities - streamlined version.

FINDING CLASSIFICATIONS:
- TRUE POSITIVE (TP): Actual security vulnerability
- FALSE POSITIVE (FP): Flagged but not a real issue in context
- UNCERTAIN: Could be either depending on deployment context
"""

import os
import random
import secrets
import ssl
import string
from typing import Tuple

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend

# =============================================================================
# HARDCODED SECRET PATTERNS
# =============================================================================

# TP: Hardcoded production key
PRODUCTION_KEY = b"aK9$mX2#pL7@nQ4&wE8*rT5%yU1!oI3^"

# FP: Example/placeholder key clearly marked
EXAMPLE_KEY = "REPLACE_THIS_KEY_IN_PRODUCTION"

# FP: Test key with test prefix
TEST_API_KEY = "test_sk_4eC39HqLyjWDarjtT1zdp7dc"

# UNCERTAIN: Looks real but might be intentionally fake
BACKUP_KEY = "bkp_2024_xK9mP2sL7nQ4wE8rT5yU1oI3aB6cD"

# =============================================================================
# RANDOM NUMBER PATTERNS
# =============================================================================


def generate_session_token_insecure() -> str:
    """TP: Using random for session token."""
    chars = string.ascii_letters + string.digits
    return "".join(random.choice(chars) for _ in range(32))


def generate_otp_insecure() -> str:
    """TP: Using random for OTP."""
    return str(random.randint(100000, 999999))


def generate_session_token_secure() -> str:
    """FP: Using secrets for session token."""
    return secrets.token_urlsafe(32)


def shuffle_playlist(songs: list) -> list:
    """FP: random is fine for non-security shuffling."""
    result = songs.copy()
    random.shuffle(result)
    return result


def roll_dice() -> int:
    """FP: random for game mechanics."""
    return random.randint(1, 6)


# =============================================================================
# CIPHER MODE PATTERNS
# =============================================================================


def encrypt_ecb(key: bytes, data: bytes) -> bytes:
    """TP: ECB mode reveals patterns."""
    cipher = Cipher(algorithms.AES(key), modes.ECB(), backend=default_backend())
    encryptor = cipher.encryptor()
    return encryptor.update(data) + encryptor.finalize()


def encrypt_cbc_random_iv(key: bytes, data: bytes) -> Tuple[bytes, bytes]:
    """FP: CBC with random IV is secure."""
    iv = os.urandom(16)
    cipher = Cipher(algorithms.AES(key), modes.CBC(iv), backend=default_backend())
    encryptor = cipher.encryptor()
    return iv, encryptor.update(data) + encryptor.finalize()


# =============================================================================
# SSL/TLS PATTERNS
# =============================================================================


def create_insecure_context() -> ssl.SSLContext:
    """TP: Certificate verification disabled."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def create_secure_context() -> ssl.SSLContext:
    """FP: Properly configured secure context."""
    context = ssl.create_default_context()
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


@@ -0,0 +1,92 @@
"""Database module - streamlined version.

FINDING CLASSIFICATIONS:
- TRUE POSITIVE (TP): Actual security vulnerability
- FALSE POSITIVE (FP): Flagged but not a real issue in context
- UNCERTAIN: Could be either depending on deployment context
"""

import hashlib
import hmac
import secrets
from typing import Any, List, Optional

from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker


class DatabaseManager:
    """Database operations with SQL patterns."""

    def __init__(self, db_url: str = "sqlite:///app.db"):
        self.engine = create_engine(db_url)
        self.Session = sessionmaker(bind=self.engine)

    # =========================================================================
    # SQL INJECTION PATTERNS
    # =========================================================================

    def find_by_username_unsafe(self, username: str) -> Optional[dict]:
        """TP: SQL injection via string formatting."""
        session = self.Session()
        query = f"SELECT * FROM users WHERE username = '{username}'"
        result = session.execute(text(query))
        return result.fetchone()

    def search_users_unsafe(self, search_term: str) -> List[dict]:
        """TP: SQL injection in LIKE clause."""
        session = self.Session()
        query = f"SELECT * FROM users WHERE username LIKE '%{search_term}%'"
        result = session.execute(text(query))
        return result.fetchall()

    def find_by_id_safe(self, user_id: int) -> Optional[dict]:
        """FP: Parameterized query is safe."""
        session = self.Session()
        result = session.execute(
            text("SELECT * FROM users WHERE id = :id"), {"id": user_id}
        )
        return result.fetchone()

    def dynamic_column_sort(self, column: str, order: str = "ASC") -> List[dict]:
        """UNCERTAIN: Column name from allowlist but still uses f-string."""
        allowed_columns = ["username", "email", "created_at"]
        if column not in allowed_columns:
            raise ValueError("Invalid column")
        session = self.Session()
        query = f"SELECT * FROM users ORDER BY {column} {order}"
        result = session.execute(text(query))
        return result.fetchall()


class PasswordManager:
    """Password hashing patterns."""

    def hash_password_md5(self, password: str) -> str:
        """TP: MD5 is cryptographically broken for passwords."""
        return hashlib.md5(password.encode()).hexdigest()

    def hash_password_sha1(self, password: str) -> str:
        """TP: SHA1 is weak for password hashing."""
        return hashlib.sha1(password.encode()).hexdigest()

    def compute_file_checksum_md5(self, filepath: str) -> str:
        """FP: MD5 acceptable for file integrity (non-security)."""
        hasher = hashlib.md5(usedforsecurity=False)
        with open(filepath, "rb") as f:
            for chunk in iter(lambda: f.read(4096), b""):
                hasher.update(chunk)
        return hasher.hexdigest()

    def verify_signature_sha256(
        self, message: bytes, signature: str, key: bytes
    ) -> bool:
        """FP: HMAC-SHA256 for signatures is secure."""
        expected = hmac.new(key, message, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, signature)

    def hash_password_pbkdf2(self, password: str) -> tuple:
        """FP: PBKDF2 is a proper password hash."""
        salt = secrets.token_bytes(32)
        key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600000)
        return key.hex(), salt.hex()


@@ -0,0 +1,77 @@
"""Network client module - streamlined version.

FINDING CLASSIFICATIONS:
- TRUE POSITIVE (TP): Actual security vulnerability
- FALSE POSITIVE (FP): Flagged but not a real issue in context
- UNCERTAIN: Could be either depending on deployment context
"""

import ssl
import urllib.request
from typing import Dict
from urllib.parse import urljoin, urlparse

import requests


class APIClient:
    """HTTP API client."""

    def __init__(self, base_url: str):
        self.base_url = base_url

    def get_insecure(self, endpoint: str) -> Dict:
        """TP: SSL verification disabled."""
        url = urljoin(self.base_url, endpoint)
        response = requests.get(url, verify=False, timeout=30)
        return response.json()

    def get_secure(self, endpoint: str) -> Dict:
        """FP: Default SSL verification."""
        url = urljoin(self.base_url, endpoint)
        response = requests.get(url, verify=True, timeout=30)
        return response.json()

    def get_no_timeout(self, endpoint: str) -> Dict:
        """TP: No timeout specified."""
        url = urljoin(self.base_url, endpoint)
        response = requests.get(url)  # No timeout!
        return response.json()

    def get_with_timeout(self, endpoint: str) -> Dict:
        """FP: Proper timeout specified."""
        url = urljoin(self.base_url, endpoint)
        response = requests.get(url, timeout=30)
        return response.json()


class URLFetcher:
    """Fetch URLs."""

    def fetch_any_url(self, url: str) -> bytes:
        """TP: Arbitrary URL fetch (SSRF potential)."""
        with urllib.request.urlopen(url) as response:
            return response.read()

    def fetch_https_only(self, url: str) -> bytes:
        """FP: Only HTTPS URLs allowed."""
        parsed = urlparse(url)
        if parsed.scheme != "https":
            raise ValueError("Only HTTPS URLs allowed")
        with urllib.request.urlopen(url, timeout=30) as response:
            return response.read()

    def fetch_allowlisted(self, url: str) -> bytes:
        """FP: Domain allowlist."""
        allowed = ["api.example.com", "cdn.example.com"]
        parsed = urlparse(url)
        if parsed.netloc not in allowed:
            raise ValueError("Domain not allowed")
        with urllib.request.urlopen(url, timeout=30) as response:
            return response.read()

    def fetch_unverified_ssl(self, url: str) -> bytes:
        """TP: Unverified SSL context."""
        context = ssl._create_unverified_context()
        with urllib.request.urlopen(url, context=context) as response:
            return response.read()


@@ -0,0 +1,20 @@
"""Production secrets - THIS FILE SHOULD NOT BE IN VERSION CONTROL!
TP: All secrets in this file are real production credentials.
"""
# TP: Real AWS credentials
AWS_PROD_ACCESS_KEY = "AKIAI44QH8DHBPRODKEY"
AWS_PROD_SECRET_KEY = "je7MtGbClwBF/2Zp9Utk/h3yCo8nvbPRODSECRET"
# TP: Real Stripe production key
STRIPE_PROD_SECRET = "sk_live_51HqJK2eZvKYlo2CProdSecretKey123"
# TP: Real GitHub PAT
GITHUB_PROD_PAT = "ghp_ProdTokenaBcDeFgHiJkLmNoPqRsTuVwXyZ12"
# TP: Real SSH private key
SSH_PRIVATE_KEY = """-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAAlwAAAAdzc2gtcn
NhAAAAAwEAAQAAAYEA0Z3VS5JJcds3xfn/ygWyF8PbnGy0AHB1x4JLHlLxMIWPqlrRkj17
-----END OPENSSH PRIVATE KEY-----"""


@@ -0,0 +1,192 @@
"""Semgrep-specific patterns module - streamlined version.

FINDING CLASSIFICATIONS:
- TRUE POSITIVE (TP): Actual security vulnerability
- FALSE POSITIVE (FP): Flagged but not a real issue in context
- UNCERTAIN: Could be either depending on deployment context
"""

import os
import re
import json
import logging
from typing import Any, Dict, List
from urllib.parse import urlparse

from flask import Flask, request, redirect, make_response, jsonify, send_file
import jwt
import requests

logger = logging.getLogger(__name__)
app = Flask(__name__)

# =============================================================================
# OPEN REDIRECT PATTERNS
# =============================================================================


@app.route("/redirect/unsafe")
def redirect_unsafe():
    """TP: Open redirect - user controls destination URL."""
    next_url = request.args.get("next", "/")
    return redirect(next_url)


@app.route("/redirect/validated")
def redirect_validated():
    """FP: Redirect with domain validation."""
    next_url = request.args.get("next", "/")
    parsed = urlparse(next_url)
    if parsed.netloc and parsed.netloc != "example.com":
        return redirect("/")
    return redirect(next_url)


@app.route("/redirect/relative_only")
def redirect_relative():
    """UNCERTAIN: Checks :// but not protocol-relative URLs."""
    next_url = request.args.get("next", "/")
    if "://" in next_url:
        return redirect("/")
    return redirect(next_url)


# =============================================================================
# PATH TRAVERSAL PATTERNS
# =============================================================================


@app.route("/files/download")
def download_file():
    """TP: Path traversal via user-controlled filename."""
    filename = request.args.get("file", "readme.txt")
    filepath = os.path.join("/var/www/files", filename)
    return send_file(filepath)


@app.route("/files/safe_download")
def safe_download():
    """FP: Path traversal prevented with realpath check."""
    filename = request.args.get("file", "readme.txt")
    base_dir = "/var/www/files"
    filepath = os.path.join(base_dir, filename)
    real_path = os.path.realpath(filepath)
    if not real_path.startswith(os.path.realpath(base_dir)):
        return "Access denied", 403
    return send_file(real_path)


# =============================================================================
# JWT PATTERNS
# =============================================================================

JWT_SECRET = "super_secret_jwt_key_12345"  # TP: Hardcoded JWT secret


def verify_jwt_none_allowed(token: str) -> Dict:
    """TP: JWT verification disabled."""
    return jwt.decode(token, options={"verify_signature": False})


def verify_jwt_secure(token: str, secret: str) -> Dict:
    """FP: JWT with externally provided secret."""
    return jwt.decode(token, secret, algorithms=["HS256"])
# =============================================================================
# SSRF PATTERNS
# =============================================================================


@app.route("/fetch/url")
def fetch_url():
    """TP: SSRF - fetches arbitrary user-provided URL."""
    url = request.args.get("url")
    response = requests.get(url)
    return response.text


@app.route("/fetch/allowlisted")
def fetch_allowlisted():
    """FP: SSRF prevented with domain allowlist."""
    url = request.args.get("url")
    parsed = urlparse(url)
    allowed_hosts = ["api.github.com", "cdn.example.com"]
    if parsed.netloc not in allowed_hosts:
        return "Domain not allowed", 403
    response = requests.get(url)
    return response.text


# =============================================================================
# HARDCODED CREDENTIALS PATTERNS
# =============================================================================

# TP: Hardcoded credentials
DATABASE_URL = "postgresql://admin:secretpassword123@db.example.com:5432/prod"
AWS_ACCESS_KEY = "AKIAIOSFODNN7EXAMPLE"

# FP: Placeholder credentials
EXAMPLE_API_KEY = "your_api_key_here"
TEST_DATABASE_URL = "postgresql://test:test@localhost:5432/test_db"

# UNCERTAIN: Test key format but could be real
STRIPE_KEY = "sk_test_4eC39HqLyjWDarjtT1zdp7dc"

# =============================================================================
# COMMAND INJECTION PATTERNS
# =============================================================================


def run_system_command(user_input: str):
    """TP: Command injection via os.system."""
    os.system(f"echo {user_input}")


def run_safe_command():
    """FP: Hardcoded command, no user input."""
    os.system("date")


# =============================================================================
# INSECURE RANDOM PATTERNS
# =============================================================================

import random


def generate_token_insecure() -> str:
    """TP: Using random module for security token."""
    return "".join(random.choices("abcdefghijklmnopqrstuvwxyz0123456789", k=32))


def shuffle_playlist(items: List[str]) -> List[str]:
    """FP: Random for non-security purpose."""
    shuffled = items.copy()
    random.shuffle(shuffled)
    return shuffled


# =============================================================================
# DEBUG MODE PATTERNS
# =============================================================================

DEBUG_MODE = True  # TP: Debug flag


@app.route("/debug/eval")
def debug_eval():
    """TP: Debug endpoint with eval."""
    if DEBUG_MODE:
        expr = request.args.get("expr", "1+1")
        return str(eval(expr))
    return "Disabled"


if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=5001)  # TP: Debug mode


@@ -0,0 +1 @@
"""Services module with additional patterns."""


@@ -0,0 +1,39 @@
"""Authentication service - streamlined version."""

import hashlib
import hmac
import secrets
from typing import Optional


class AuthenticationService:
    """Handle user authentication."""

    # TP: Hardcoded JWT secret
    JWT_SECRET = "hardcoded_jwt_secret_key_2024"

    # FP: Default for development only
    DEV_SECRET = "development_only_not_production"

    def hash_password_insecure(self, password: str) -> str:
        """TP: MD5 for password hashing."""
        return hashlib.md5(password.encode()).hexdigest()

    def generate_token_secure(self, user_id: int) -> str:
        """FP: Secrets module for token generation."""
        token = secrets.token_urlsafe(32)
        return f"{user_id}:{token}"

    def verify_webhook_signature(self, payload: bytes, signature: str) -> bool:
        """FP: HMAC verification is secure."""
        expected = hmac.new(
            self.JWT_SECRET.encode(), payload, hashlib.sha256
        ).hexdigest()
        return hmac.compare_digest(expected, signature)

    def verify_webhook_insecure(self, payload: bytes, signature: str) -> bool:
        """TP: Using == for signature comparison (timing attack)."""
        expected = hmac.new(
            self.JWT_SECRET.encode(), payload, hashlib.sha256
        ).hexdigest()
        return expected == signature  # Timing attack vulnerable!
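The MD5 finding above is a TP because MD5 is fast and unsalted. As a point of comparison, the following sketch shows a salted, iterated hash built only from the standard library. It is not part of this commit: the helper names, the `iterations$salt$digest` storage format, and the default iteration count are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets


def hash_password(password: str, iterations: int = 600_000) -> str:
    """Derive a salted PBKDF2-HMAC-SHA256 hash (hypothetical storage format)."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    iterations_text, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations_text)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Because each call draws a fresh salt, hashing the same password twice yields different strings, which is the property the MD5 variant lacks.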


@@ -0,0 +1,42 @@
"""File handling service - streamlined version."""

import os
import pickle
import tempfile
from typing import Any

import yaml


class FileService:
    """Handle file operations."""

    def load_pickle_user_path(self, filepath: str) -> Any:
        """TP: Pickle from user-controlled path."""
        with open(filepath, "rb") as f:
            return pickle.load(f)

    def load_pickle_fixed_path(self) -> Any:
        """FP: Pickle from known internal path."""
        with open("/etc/app/cache.pkl", "rb") as f:
            return pickle.load(f)

    def save_temp_insecure(self, data: bytes) -> str:
        """TP: Predictable temp file."""
        filepath = f"/tmp/data_{os.getpid()}.dat"
        with open(filepath, "wb") as f:
            f.write(data)
        return filepath

    def save_temp_secure(self, data: bytes) -> str:
        """FP: Secure temp file creation."""
        with tempfile.NamedTemporaryFile(delete=False) as f:
            f.write(data)
        return f.name

    def load_yaml_unsafe(self, yaml_string: str) -> Any:
        """TP: Unsafe YAML loader."""
        return yaml.load(yaml_string, Loader=yaml.Loader)

    def load_yaml_safe(self, yaml_string: str) -> Any:
        """FP: SafeLoader is secure."""
        return yaml.safe_load(yaml_string)
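The TP/FP split for the pickle cases above rests on who controls the bytes being loaded. The sketch below illustrates why: unpickling runs whatever callable the data names. The `Payload` class and `roundtrip` helper are hypothetical and not part of this commit, and the harmless built-in `sorted` stands in for what an attacker would replace with something destructive.

```python
import pickle


class Payload:
    """Hypothetical attacker-controlled object used only for illustration."""

    def __reduce__(self):
        # pickle records "call sorted([3, 1, 2])" rather than object state,
        # so the callable runs inside pickle.loads on the victim side.
        return (sorted, ([3, 1, 2],))


def roundtrip(obj):
    """Serialize and immediately deserialize an object with pickle."""
    return pickle.loads(pickle.dumps(obj))
```

Calling `roundtrip(Payload())` does not return a `Payload` at all; it returns the result of the call embedded in the data. That is why a fixed internal path is only an FP as long as nothing untrusted can write to that path.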

175
src/security_demo/utils.py Normal file

@@ -0,0 +1,175 @@
"""Utilities module - streamlined for Pylint patterns.

FINDING CLASSIFICATIONS:
- TRUE POSITIVE (TP): Actual code quality issue
- FALSE POSITIVE (FP): Flagged but acceptable in context
- UNCERTAIN: Depends on coding standards/context
"""

import json
import logging
from typing import Any, Dict, List

# TP: Module-level variable not UPPER_CASE
global_counter = 0

# FP: Constant follows convention
MAX_RETRIES = 3


# =============================================================================
# NAMING CONVENTION PATTERNS
# =============================================================================


def processData(items):  # TP: not snake_case
    """Process items."""
    return [item * 2 for item in items]


def calculate_total(values):  # FP: Proper snake_case
    """Calculate total."""
    return sum(values)


class userManager:  # TP: not PascalCase
    """Manage users."""

    pass


class UserRepository:  # FP: Proper PascalCase
    """User repository."""

    pass


# =============================================================================
# ARGUMENT PATTERNS
# =============================================================================


def too_many_arguments(a, b, c, d, e, f, g, h, i, j, k):
    """TP: Too many arguments."""
    return sum([a, b, c, d, e, f, g, h, i, j, k])


def reasonable_arguments(user_id: int, name: str, email: str) -> dict:
    """FP: Reasonable number of arguments."""
    return {"id": user_id, "name": name, "email": email}


# =============================================================================
# DEFAULT ARGUMENT PATTERNS
# =============================================================================


def mutable_default_list(items=[]):  # TP: Mutable default
    """TP: Mutable default argument."""
    items.append(1)
    return items


def safe_default_none(items=None):  # FP: Safe None default
    """FP: Safe None default pattern."""
    if items is None:
        items = []
    return items
# =============================================================================
# EXCEPTION HANDLING PATTERNS
# =============================================================================


def bare_except_handler(data):
    """TP: Bare except catches everything."""
    try:
        return json.loads(data)
    except:  # TP: bare except
        return None


def specific_except_handler(data):
    """FP: Specific exception handling."""
    try:
        return json.loads(data)
    except json.JSONDecodeError:
        return None


# =============================================================================
# BUILTIN SHADOWING PATTERNS
# =============================================================================


def shadow_builtins(list, dict):  # TP: Shadows builtins
    """TP: Shadows multiple builtins."""
    return len(list) + len(dict)


def proper_naming(items: List[int], mapping: Dict) -> int:  # FP
    """FP: Descriptive names don't shadow."""
    return len(items) + len(mapping)


# =============================================================================
# RETURN STATEMENT PATTERNS
# =============================================================================


def inconsistent_return(value):  # TP: Implicit None
    """TP: Some paths return None implicitly."""
    if value > 0:
        return value
    # Implicit None return


def all_paths_return(value):  # FP
    """FP: All paths return explicitly."""
    if value > 0:
        return value
    return 0


# =============================================================================
# LOOP PATTERNS
# =============================================================================


def range_len_antipattern(items):  # TP: Should use enumerate
    """TP: Should use enumerate."""
    result = []
    for i in range(len(items)):
        result.append((i, items[i]))
    return result


def proper_enumerate(items):  # FP
    """FP: Proper enumerate usage."""
    return [(i, item) for i, item in enumerate(items)]


# =============================================================================
# DOCUMENTATION PATTERNS
# =============================================================================


def function_without_docstring():  # TP: Missing docstring
    pass


def function_with_docstring():  # FP
    """This function has a docstring."""
    pass


class ClassWithoutDocstring:  # TP
    pass


class ClassWithDocstring:  # FP
    """This class has a docstring."""

    pass
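The mutable-default pair in this module is the one Pylint pattern here that changes runtime behavior rather than style. The sketch below reproduces both functions from the file and adds a hypothetical helper, `call_twice`, to show the difference; the helper is an assumption for illustration and not part of this commit.

```python
def mutable_default_list(items=[]):
    """Same body as the TP above: the default list is created once."""
    items.append(1)
    return items


def safe_default_none(items=None):
    """Same body as the FP above: a fresh list is created per call."""
    if items is None:
        items = []
    return items


def call_twice(func):
    """Hypothetical helper: call func with no arguments twice."""
    first = func()
    second = func()
    return first, second
```

With `mutable_default_list`, both results are the same list object and it keeps growing across calls; with `safe_default_none`, each call gets its own empty list.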


@@ -0,0 +1,174 @@
"""Web application module - streamlined version.

FINDING CLASSIFICATIONS:
- TRUE POSITIVE (TP): Actual security vulnerability
- FALSE POSITIVE (FP): Flagged but not a real issue in context
- UNCERTAIN: Could be either depending on deployment context
"""

import os
import subprocess
import pickle
import tempfile
from typing import Any

from flask import Flask, request, render_template_string
import yaml

app = Flask(__name__)

# TP: Hardcoded secret key
app.config["SECRET_KEY"] = "production_secret_key_v2_xK9#mP2$"

# FP: Environment variable with fallback
app.config["DEV_API_KEY"] = os.environ.get("API_KEY", "dev_placeholder_key")


# =============================================================================
# COMMAND INJECTION PATTERNS
# =============================================================================


@app.route("/admin/execute")
def admin_execute():
    """TP: Direct shell injection from user input."""
    command = request.args.get("cmd", "whoami")
    result = subprocess.call(command, shell=True)
    return {"exit_code": result}


@app.route("/build/compile")
def compile_code():
    """FP: Shell=True but command is completely hardcoded."""
    result = subprocess.call("make clean && make build", shell=True)
    return {"status": "completed", "exit_code": result}


@app.route("/health/disk")
def check_disk():
    """FP: No shell, hardcoded command list."""
    result = subprocess.run(["/usr/bin/df", "-h", "/"], capture_output=True, text=True)
    return {"disk_usage": result.stdout}


# =============================================================================
# TEMPLATE INJECTION PATTERNS
# =============================================================================


@app.route("/render/custom")
def render_custom():
    """TP: User controls entire template string."""
    template = request.args.get("tpl", "{{ 7*7 }}")
    return render_template_string(template)


@app.route("/report/generate")
def generate_report():
    """FP: Template hardcoded, only data is dynamic."""
    user_name = request.args.get("name", "Anonymous")
    REPORT_TEMPLATE = "<h1>Report for {{ name }}</h1>"
    return render_template_string(REPORT_TEMPLATE, name=user_name)


# =============================================================================
# DESERIALIZATION PATTERNS
# =============================================================================


@app.route("/session/load")
def load_session():
    """TP: Pickle load from user-controlled path."""
    session_file = request.args.get("file")
    with open(session_file, "rb") as f:
        data = pickle.load(f)
    return {"session": str(data)}


@app.route("/config/load")
def load_config():
    """FP: Pickle from known safe internal path."""
    with open("/etc/app/internal_config.pkl", "rb") as f:
        config = pickle.load(f)
    return {"config_keys": list(config.keys())}
# =============================================================================
# YAML PATTERNS
# =============================================================================


@app.route("/yaml/parse")
def parse_yaml():
    """TP: Unsafe YAML loader with user input."""
    yaml_content = request.get_data(as_text=True)
    data = yaml.load(yaml_content, Loader=yaml.Loader)
    return {"parsed": data}


@app.route("/yaml/safe")
def yaml_safe():
    """FP: SafeLoader is secure."""
    yaml_content = request.get_data(as_text=True)
    data = yaml.safe_load(yaml_content)
    return {"data": data}


# =============================================================================
# TEMP FILE PATTERNS
# =============================================================================


@app.route("/upload/process")
def process_upload():
    """TP: Predictable temp file path."""
    data = request.get_data()
    filepath = f"/tmp/upload_{os.getpid()}"
    with open(filepath, "wb") as f:
        f.write(data)
    return {"saved_to": filepath}


@app.route("/export/csv")
def export_csv():
    """FP: Uses tempfile module correctly."""
    with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
        f.write("name,value\n")
    return {"file": f.name}


# =============================================================================
# EVAL PATTERNS
# =============================================================================


def eval_user_code(code: str) -> Any:
    """TP: Direct eval of user input."""
    return eval(code)


def literal_eval_safe(expr: str) -> Any:
    """FP: ast.literal_eval is safe."""
    import ast

    return ast.literal_eval(expr)


# =============================================================================
# BINDING PATTERNS
# =============================================================================


def get_production_bind() -> str:
    """TP: Binds to all interfaces."""
    return "0.0.0.0"


def get_internal_bind() -> str:
    """FP: Localhost only."""
    return "127.0.0.1"


if __name__ == "__main__":
    app.run(host=get_production_bind(), debug=True, port=5000)
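The eval pair above is classified on the assumption that `ast.literal_eval` only accepts Python literals. A minimal sketch of that behavior follows; the wrapper name `parse_literal` and its choice to return `None` on rejection are assumptions for illustration, not part of this commit.

```python
import ast
from typing import Any, Optional


def parse_literal(expr: str) -> Optional[Any]:
    """Parse a Python literal; return None for anything that is not one.

    Function calls, attribute access, and names are rejected by
    ast.literal_eval, which is what separates it from eval().
    """
    try:
        return ast.literal_eval(expr)
    except (ValueError, SyntaxError, TypeError):
        return None
```

For example, `parse_literal("{'a': [1, 2]}")` yields a dict, while an expression such as `__import__('os').getcwd()` is rejected rather than executed. Note that a caller cannot distinguish the literal `None` from a rejection with this wrapper.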

1
tests/__init__.py Normal file

@@ -0,0 +1 @@
"""Tests package."""

25
tests/fixtures.py Normal file

@@ -0,0 +1,25 @@
"""Test fixtures containing mock credentials.

FP: All values are test fixtures, not real credentials.
"""

# FP: Example AWS credentials
TEST_AWS_ACCESS_KEY = "AKIAIOSFODNN7EXAMPLE"
TEST_AWS_SECRET_KEY = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

# FP: Mock Stripe keys (sk_test indicates test mode)
MOCK_STRIPE_SECRET = "sk_test_4eC39HqLyjWDarjtT1zdp7dc"

# FP: Example JWT for testing
MOCK_JWT_TOKEN = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0.dozjgNryP4J3jVmNHl0w5N_XgL0n3I9PlFUP0THsR8U"

# FP: Test database URL (localhost)
TEST_DATABASE_URL = "postgresql://testuser:testpassword@localhost:5432/testdb"


def get_test_credentials():
    """Return test credentials for unit tests."""
    return {
        "username": "test_user",
        "password": "test_password_123",
    }

27
tests/test_demo.py Normal file

@@ -0,0 +1,27 @@
"""Unit tests."""

import pytest

from tests.fixtures import (
    TEST_AWS_ACCESS_KEY,
    MOCK_STRIPE_SECRET,
    get_test_credentials,
    MockAuthProvider,
)


def test_placeholder():
    """Placeholder test."""
    assert True


def test_mock_auth():
    """Test mock authentication."""
    auth = MockAuthProvider()
    assert auth.get_token() == "mock_access_token_xyz789"


def test_credentials():
    """Test getting test credentials."""
    creds = get_test_credentials()
    assert "username" in creds
    assert "password" in creds