Skip to content

Latest commit

 

History

History
173 lines (123 loc) · 6.43 KB

File metadata and controls

173 lines (123 loc) · 6.43 KB

The Redis-based search/lookup solution can be made GDPR compliant when properly implemented with externalized keys, but there are several important considerations:

✅ GDPR-Compliant Aspects:

  1. No Plaintext PII Storage:
  • Redis only stores HMAC hashes, never plaintext PII
  • Original data remains encrypted in the database via Vault
  1. Deterministic Data Erasure:
  • Can delete specific person's records by removing their HMAC hashes
  • Supports "right to be forgotten" requirements
  1. Access Controls:
  • Redis can be secured with authentication/authorization
  • Audit logging for all access patterns

⚠️ Key GDPR Requirements to Address:

  1. Key Management (Your AWS Secrets Approach): // GDPR-compliant key rotation const hmacKey = await getSecretFromAWS('pii-hmac-key-v2'); // Must support key versioning for data migration

  2. Data Minimization:

  • Only index necessary fields for business purposes
  • Implement data retention policies for Redis keys
  1. Pseudonymization Controls: // Proper pseudonymization with secure tokens const secureToken = crypto.randomBytes(32).toString('hex'); // Tokens must be cryptographically secure, not predictable

  2. Additional GDPR Requirements:

  • Data Processing Records: Log what PII fields are being processed
  • Impact Assessments: Document privacy risks of the indexing approach
  • Cross-border Data: Ensure Redis infrastructure compliance if data crosses EU borders
  • Third-party Processors: Vault, AWS, Redis providers must have proper DPAs

🔧 Enhanced GDPR Implementation:

The current system would need:

  1. Key rotation strategy (AWS Secrets Manager)
  2. Data retention policies for Redis keys
  3. Enhanced audit logging for all HMAC operations
  4. Consent management integration
  5. Regular security assessments

The Redis approach is actually more GDPR-friendly than traditional database searches because it provides strong pseudonymization while maintaining search functionality.

DB Approach (Identical to Redis)

say I create a new table in the DB which is identical in its schema to the REDIS approach (key-value) and store identical values as REDIS in this new table, will this new table approach be GDPR complaint as well ?

⏺ Yes, a database table with identical key-value schema storing the same HMAC hashes would be equally GDPR compliant as the Redis approach. The compliance comes from the data structure and cryptographic approach, not the storage technology.

✅ Identical GDPR Compliance:

Database Table Approach: CREATE TABLE pii_search_index ( hmac_key VARCHAR(255) PRIMARY KEY, token_set TEXT, field_type VARCHAR(50), created_at TIMESTAMP, retention_until TIMESTAMP );

🔍 Primary Index Commands:

-- Primary key index (automatically created) -- hmac_key is already the PRIMARY KEY

-- Index for field type filtering (most common query pattern) CREATE INDEX idx_pii_search_field_type ON pii_search_index(field_type);

-- Composite index for field_type + hmac_key lookups (covers most queries) CREATE INDEX idx_pii_search_field_hmac ON pii_search_index(field_type, hmac_key);

-- Index for retention/cleanup operations CREATE INDEX idx_pii_search_retention ON pii_search_index(retention_until);

-- Index for audit/reporting by creation date CREATE INDEX idx_pii_search_created ON pii_search_index(created_at);

📊 Query Pattern Analysis:

Most Common Query: -- This will use idx_pii_search_field_hmac efficiently SELECT token_set FROM pii_search_index WHERE field_type = 'EMAIL' AND hmac_key = 'sha256_hash_here';

Batch Cleanup: -- This will use idx_pii_search_retention efficiently
DELETE FROM pii_search_index WHERE retention_until < NOW();

Field Statistics: -- This will use idx_pii_search_field_type efficiently SELECT field_type, COUNT(*) FROM pii_search_index GROUP BY field_type;

🎯 Performance Optimization:

For High-Volume Systems, consider: -- Partial index for active records only CREATE INDEX idx_pii_search_active ON pii_search_index(field_type, hmac_key) WHERE retention_until > NOW();

-- Index for token pattern searches (if needed) CREATE INDEX idx_pii_search_tokens ON pii_search_index USING GIN(string_to_array(token_set, ','));

-- Same HMAC hashes as Redis INSERT INTO pii_search_index VALUES ('sha256_hmac_of_search_term', 'TKN_123,TKN_456', 'EMAIL', NOW(), '2025-12-31');

vs Redis: HSET pii:search:EMAIL sha256_hmac_of_search_term "TKN_123,TKN_456"

🔍 GDPR Comparison:

GDPR Aspect Database Table Redis
No Plaintext PII ✅ Same HMAC hashes ✅ Same HMAC hashes
Pseudonymization ✅ Identical approach ✅ Identical approach
Data Erasure ✅ DELETE FROM pii_search_index WHERE... ✅ DEL key
Key Management ✅ Same externalized keys ✅ Same externalized keys
Audit Logging ✅ Database audit logs ✅ Redis audit logs
Access Controls ✅ Database permissions ✅ Redis AUTH/ACLs

💡 Additional Benefits of Database Approach:

  1. ACID Compliance:
  • Atomic updates during data erasure
  • Consistent state during key rotation
  1. Better Audit Trail: -- Built-in audit capabilities SELECT * FROM pii_search_index_audit WHERE action = 'DELETE' AND user_id = 'gdpr_request_123';

  2. Data Retention Automation: -- Automatic cleanup based on retention policies DELETE FROM pii_search_index WHERE retention_until < NOW();

  3. Compliance Reporting: -- Generate GDPR compliance reports SELECT field_type, COUNT(*) FROM pii_search_index GROUP BY field_type;

⚠️ Trade-offs:

  • Performance: Database may be slightly slower for high-frequency lookups
  • Scalability: Redis typically handles concurrent reads better
  • Memory: Database uses disk storage vs Redis in-memory

🎯 Recommendation:

Both approaches are equally GDPR compliant. Choose based on:

  • Database approach: Better for compliance auditing, ACID guarantees, retention policies
  • Redis approach: Better for high-performance search, concurrent access

The key insight is that GDPR compliance comes from the cryptographic design (HMAC pseudonymization + secure key management), not the storage technology itself.