
Amazon Macie
Managed sensitive data discovery and protection for Amazon S3.

Scan every data source for PII and secrets instantly
A powerful CLI tool that scans cloud storage, databases, and file systems for PII and secrets, supporting OCR, custom fingerprints, JSON reports, and Slack alerts.
Hawk-eye is a command‑line scanner designed for security teams, DevOps engineers, and compliance auditors who need to locate personally identifiable information and secret credentials across heterogeneous environments. It connects to cloud buckets (S3, GCS), relational and NoSQL databases (MySQL, PostgreSQL, MongoDB, CouchDB, Redis, Firebase), collaboration platforms (Slack, Google Drive) and local file systems, then inspects a wide range of file types—including PDFs, Office documents, images, archives, and video—using text analysis and OCR.
The tool can be installed via pip, run as a Docker container, or imported as a Python library, giving flexibility for CI/CD pipelines or ad-hoc investigations. Results are emitted as JSON and can be streamed to Slack for real-time alerts. Advanced users can supply custom fingerprint patterns or adjust connection settings through YAML or inline JSON. Two setup caveats apply: PostgreSQL scanning requires the optional psycopg2-binary package, and some Linux distributions (Red Hat, for example) need mesa-libGL for the cv2 dependency. Otherwise the tool is straightforward to script.
When teams consider Hawk-eye, these hosted platforms usually appear on the same shortlist.

Managed sensitive data discovery and protection for Amazon S3.

Data intelligence platform focused on data privacy, security, and governance through sensitive data discovery and classification

Unified trust platform for privacy, consent, data governance, and compliance automation.
Cloud storage compliance audit
Identify exposed PII in S3 and GCS buckets, generate a JSON report, and notify a Slack channel.
Database secret leakage detection
Scan MySQL, PostgreSQL, and MongoDB for hard‑coded credentials and personal data, then alert the security team.
CI/CD pipeline integration
Run Hawk-eye as a Docker step to fail builds when secrets are found in code or bundled assets.
File system forensics
Recursively scan local directories, archives, and PDFs for hidden PII, producing actionable findings.
Create a connection.yml file with source credentials, then either mount it into the Docker container or pass it with the --connection flag. Connection settings can also be supplied as inline JSON with the --connection-json flag.
Yes, you can supply a custom fingerprint.yml or inline fingerprint JSON to define regexes for additional data types.
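To make the idea of a fingerprint concrete, here is a minimal Python sketch of name-to-regex matching. It is not Hawk-eye's implementation: the pattern names, the regexes, and the scan_text helper are all hypothetical, chosen only to illustrate how a named pattern turns into a finding.

```python
import re

# Hypothetical fingerprint set: pattern name -> regex.
# These names and regexes are illustrative, not Hawk-eye's built-in patterns.
FINGERPRINTS = {
    "Email": r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
    "AWS Access Key ID": r"\bAKIA[0-9A-Z]{16}\b",
}


def scan_text(text, fingerprints=FINGERPRINTS):
    """Return {pattern_name: [matches]} for every pattern that matched."""
    findings = {}
    for name, pattern in fingerprints.items():
        matches = re.findall(pattern, text)
        if matches:
            findings[name] = matches
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = "contact: jane.doe@example.com key=AKIAIOSFODNN7EXAMPLE"
print(scan_text(sample))
```

Adding a data type in this sketch means adding one more name and regex to the dictionary, which is roughly the role a custom fingerprint file plays.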
Results can be written to a JSON file using the --json flag or printed to stdout; the Python API returns Python objects.
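One way to act on the JSON report in a pipeline is a small gate script that returns a failing exit code when findings exist. The sketch below assumes, hypothetically, that the report is a JSON object mapping each source name to a list of findings. The real report layout may differ, so adjust count_findings after inspecting an actual report. The function names are illustrative and not part of Hawk-eye.

```python
import json


def count_findings(report):
    """Count findings in a report shaped as {source_name: [finding, ...]}.

    The shape is an assumption made for this sketch, not a documented schema.
    """
    return sum(len(items) for items in report.values() if isinstance(items, list))


def gate(report_path):
    """Return 1 if the report at report_path has any findings, else 0."""
    with open(report_path, encoding="utf-8") as fh:
        report = json.load(fh)
    total = count_findings(report)
    print(f"{total} finding(s) in {report_path}")
    return 1 if total else 0
```

In a CI step, the return value of gate could be passed to sys.exit so that a non-empty report fails the build.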
Scanning PostgreSQL requires the psycopg2-binary package, and Red Hat Linux may need mesa-libGL for the cv2 dependency.
Commercial support can be obtained via the project's LinkedIn, Twitter, or Slack community as noted in the README.