# Security Demo - Findings Classification

This project is a benchmark for false positive detection in security analysis tools.

**Classifications:**
- **TRUE POSITIVE (TP)**: Actual security vulnerability or code quality issue
- **FALSE POSITIVE (FP)**: Flagged by the tool but not a real problem in context
- **UNCERTAIN**: Could be either depending on deployment context

## Summary Statistics

| Tool | Findings | Target TP | Target FP | Uncertain |
|------|----------|-----------|-----------|-----------|
| Bandit | ~50 | ~20 | ~20 | ~10 |
| Pylint | ~45 | ~18 | ~18 | ~9 |
| Gitleaks | ~28 | ~10 | ~12 | ~6 |
| Semgrep | ~50 | ~20 | ~20 | ~10 |

---

## Bandit Findings

### Command Injection (B602/B603)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| web_app.py:admin_execute | TP | User input in shell command |
| web_app.py:compile_code | FP | Hardcoded command, no user input |
| web_app.py:check_disk | FP | No shell, hardcoded command list |

### Template Injection (B701)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| web_app.py:render_custom | TP | User controls template string |
| web_app.py:generate_report | FP | Template hardcoded, only data varies |

### Deserialization (B301)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| web_app.py:load_session | TP | Pickle from user-controlled path |
| web_app.py:load_config | FP | Pickle from known internal path |
| services/files.py:load_pickle_user_path | TP | User controls file path |
| services/files.py:load_pickle_fixed_path | FP | Fixed internal path |

### YAML Load (B506)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| web_app.py:parse_yaml | TP | Unsafe Loader with user input |
| web_app.py:yaml_safe | FP | SafeLoader is secure |
| services/files.py:load_yaml_unsafe | TP | Unsafe Loader |
| services/files.py:load_yaml_safe | FP | SafeLoader |

### Hardcoded Secrets (B105)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| web_app.py:SECRET_KEY | TP | Hardcoded production key |
| crypto_utils.py:PRODUCTION_KEY | TP | Hardcoded key |
| crypto_utils.py:EXAMPLE_KEY | FP | Clearly marked placeholder |
| crypto_utils.py:TEST_API_KEY | FP | Test prefix indicates non-production |
| crypto_utils.py:BACKUP_KEY | UNCERTAIN | Could be real or placeholder |

### Random (B311)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| crypto_utils.py:generate_session_token_insecure | TP | Random for security token |
| crypto_utils.py:generate_otp_insecure | TP | Random for OTP |
| crypto_utils.py:shuffle_playlist | FP | Non-security use |
| crypto_utils.py:roll_dice | FP | Game mechanics |

### Weak Hash (B324)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| database.py:hash_password_md5 | TP | MD5 for passwords |
| database.py:hash_password_sha1 | TP | SHA1 for passwords |
| database.py:compute_file_checksum_md5 | FP | MD5 for integrity, not security |
| database.py:verify_signature_sha256 | FP | HMAC-SHA256 is secure |

### SSL/TLS (B501/B503)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| network_client.py:get_insecure | TP | verify=False |
| network_client.py:get_secure | FP | verify=True |
| network_client.py:fetch_unverified_ssl | TP | Unverified context |
| crypto_utils.py:create_insecure_context | TP | CERT_NONE |
| crypto_utils.py:create_secure_context | FP | Proper verification |

### Eval/Exec (B307/B102)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| web_app.py:eval_user_code | TP | Direct eval of user input |
| web_app.py:literal_eval_safe | FP | ast.literal_eval is safe |

---

## Pylint Findings

### Naming Conventions (C0103)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:processData | TP | Not snake_case |
| utils.py:calculate_total | FP | Proper snake_case |
| utils.py:userManager | TP | Class not PascalCase |
| utils.py:UserRepository | FP | Proper PascalCase |

### Mutable Default (W0102)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:mutable_default_list | TP | Mutable default [] |
| utils.py:safe_default_none | FP | Safe None pattern |

### Exception Handling (W0702)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:bare_except_handler | TP | Bare except |
| utils.py:specific_except_handler | FP | Specific exception |

### Builtin Shadowing (W0622)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:shadow_builtins | TP | Shadows list, dict |
| utils.py:proper_naming | FP | Descriptive names |

### Return Statements (R1710)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:inconsistent_return | TP | Implicit None return |
| utils.py:all_paths_return | FP | All paths explicit |

### Too Many Arguments (R0913)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:too_many_arguments | TP | 11 arguments |
| utils.py:reasonable_arguments | FP | 3 reasonable args |

### Loop Patterns (C0200)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:range_len_antipattern | TP | Should use enumerate |
| utils.py:proper_enumerate | FP | Proper enumerate |

### Documentation (C0116/C0115)

| Location | Classification | Rationale |
|----------|----------------|-----------|
| utils.py:function_without_docstring | TP | Missing docstring |
| utils.py:function_with_docstring | FP | Has docstring |
| utils.py:ClassWithoutDocstring | TP | Missing docstring |
| utils.py:ClassWithDocstring | FP | Has docstring |

---

## Gitleaks Findings

### Production Secrets (TRUE POSITIVES)

| File | Rule | Rationale |
|------|------|-----------|
| .env.production | aws-access-token | Real AWS key format |
| .env.production | stripe-access-token | sk_live_ prefix |
| .env.production | github-pat | ghp_ format |
| .env.production | private-key | RSA private key |
| src/security_demo/secrets.py | aws-access-token | Production AWS |
| src/security_demo/secrets.py | stripe-access-token | Production Stripe |
| src/security_demo/secrets.py | github-pat | Production GitHub |
| src/security_demo/secrets.py | private-key | SSH private key |
| scripts/deploy.sh | generic-api-key | Script credentials |

### Example/Test Values (FALSE POSITIVES)

| File | Rule | Rationale |
|------|------|-----------|
| config/.env.example | aws-access-token | EXAMPLE suffix |
| config/.env.example | stripe-access-token | Placeholder text |
| config/settings.example.yaml | aws-access-token | Example config |
| config/settings.example.yaml | stripe-access-token | sk_test_ prefix |
| tests/fixtures.py | aws-access-token | Test fixtures |
| tests/fixtures.py | stripe-access-token | Mock keys |
| tests/fixtures.py | jwt | Example JWT |
| docs/examples/sample_config.json | various | Documentation |

### Uncertain Cases

| File | Rule | Rationale |
|------|------|-----------|
| crypto_utils.py | generic-api-key | BACKUP_KEY - real or fake? |
| semgrep_patterns.py | stripe-access-token | sk_test_ but in src/ |

---

## Semgrep Findings

### Open Redirect

| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:redirect_unsafe | TP | User controls redirect |
| semgrep_patterns.py:redirect_validated | FP | Domain validation |
| semgrep_patterns.py:redirect_relative | UNCERTAIN | :// check but not // |

### Path Traversal

| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:download_file | TP | User-controlled filename |
| semgrep_patterns.py:safe_download | FP | Realpath check |

### JWT Security

| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:JWT_SECRET | TP | Hardcoded secret |
| semgrep_patterns.py:verify_jwt_none_allowed | TP | Verification disabled |
| semgrep_patterns.py:verify_jwt_secure | FP | External secret |

### SSRF

| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:fetch_url | TP | Arbitrary URL fetch |
| semgrep_patterns.py:fetch_allowlisted | FP | Domain allowlist |

### Hardcoded Credentials

| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:DATABASE_URL | TP | Password in URL |
| semgrep_patterns.py:AWS_ACCESS_KEY | TP | AWS key |
| semgrep_patterns.py:EXAMPLE_API_KEY | FP | Placeholder |
| semgrep_patterns.py:TEST_DATABASE_URL | FP | Localhost test |
| semgrep_patterns.py:STRIPE_KEY | UNCERTAIN | sk_test_ format |

### Command Injection

| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:run_system_command | TP | os.system with user input |
| semgrep_patterns.py:run_safe_command | FP | Hardcoded command |

### Insecure Random

| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:generate_token_insecure | TP | Random for token |
| semgrep_patterns.py:shuffle_playlist | FP | Non-security use |

### Debug Mode

| Location | Classification | Rationale |
|----------|----------------|-----------|
| semgrep_patterns.py:DEBUG_MODE | TP | Debug flag True |
| semgrep_patterns.py:debug_eval | TP | Eval in debug endpoint |
| semgrep_patterns.py:app.run | TP | debug=True |

---

## Usage for Benchmarking

Run each tool against the codebase:

```bash
bandit -r src/ -f json > bandit_results.json
pylint src/security_demo --output-format=json > pylint_results.json
gitleaks detect --source . --no-git --report-format json --report-path gitleaks_results.json
semgrep scan --config auto src/ --json > semgrep_results.json
```

Compare tool findings against this ground truth document to calculate:
- True Positive Rate (TPR)
- False Positive Rate (FPR)
- Precision and Recall

## Notes on Classification

Some findings are context-dependent:
- Development vs Production environment
- Internal vs External network exposure
- Who has access to modify configurations
- Whether validation is sufficient
- Threat model considerations

The UNCERTAIN category represents findings where classification depends on context.
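
The metrics listed under Usage for Benchmarking could be computed along the following lines. This is a minimal sketch, not part of the project. The `Metrics` class and `score` function are hypothetical names, and the sketch makes three assumptions: locations labelled UNCERTAIN are excluded from scoring, flagged locations missing from the ground truth are ignored, and an FP-labelled location that the tool does not flag counts as a true negative. Mapping each tool's raw JSON output to the `file:symbol` keys used in the tables above is left out and would need to be handled separately.

```python
from dataclasses import dataclass
from typing import AbstractSet, Mapping


@dataclass(frozen=True)
class Metrics:
    """Confusion-matrix counts for one tool (hypothetical helper)."""

    tp: int  # TP-labelled location, flagged by the tool
    fp: int  # FP-labelled location, flagged by the tool
    fn: int  # TP-labelled location, not flagged
    tn: int  # FP-labelled location, not flagged (assumed true negative)

    @staticmethod
    def _ratio(num: int, den: int) -> float:
        # Report 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    @property
    def precision(self) -> float:
        return self._ratio(self.tp, self.tp + self.fp)

    @property
    def recall(self) -> float:
        # Recall is the same quantity as the True Positive Rate (TPR).
        return self._ratio(self.tp, self.tp + self.fn)

    @property
    def fpr(self) -> float:
        return self._ratio(self.fp, self.fp + self.tn)


def score(ground_truth: Mapping[str, str], flagged: AbstractSet[str]) -> Metrics:
    """Compare flagged locations against TP/FP/UNCERTAIN labels."""
    tp = fp = fn = tn = 0
    for location, label in ground_truth.items():
        hit = location in flagged
        if label == "TP":
            tp += hit
            fn += not hit
        elif label == "FP":
            fp += hit
            tn += not hit
        # Any other label (e.g. UNCERTAIN) is skipped by assumption.
    return Metrics(tp=tp, fp=fp, fn=fn, tn=tn)


# A few rows copied from the Bandit tables above; the flagged set is made up.
ground_truth = {
    "web_app.py:admin_execute": "TP",
    "web_app.py:compile_code": "FP",
    "web_app.py:check_disk": "FP",
    "web_app.py:render_custom": "TP",
    "crypto_utils.py:BACKUP_KEY": "UNCERTAIN",
}
flagged = {
    "web_app.py:admin_execute",
    "web_app.py:compile_code",
    "crypto_utils.py:BACKUP_KEY",
}

m = score(ground_truth, flagged)
print(f"precision={m.precision:.2f} recall={m.recall:.2f} fpr={m.fpr:.2f}")
```

Skipping UNCERTAIN rows keeps context-dependent findings from moving the scores in either direction. A stricter benchmark might instead report them as a separate count.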