Challenge
Bank of America needed to protect customer data and minimise its exposure to both internal and external teams. The requirement was to build a production-sized data warehouse for testing internal and external applications without exposing sensitive data. The data had to be masked or obfuscated rather than fabricated, so that all names, street numbers and street names remained valid but were no longer associated with a specific customer. Data masking sanitises and shuffles production data to create obfuscated data sets, which other teams then use to test their major releases.
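The shuffling idea described above can be illustrated with a minimal Python sketch. It is not the DataStage job used in the project; the function name `shuffle_mask`, the column names and the sample records are all hypothetical and chosen only for illustration. Each sensitive column is shuffled independently across rows, so every value is still a real, valid value but is detached from its original record.

```python
import random


def shuffle_mask(rows, sensitive_columns, seed=None):
    """Return a copy of rows with each sensitive column shuffled across rows.

    Hypothetical helper for illustration only. Values stay valid because
    they come from the real data set, but they are reassigned to other
    records. Non-sensitive columns are left untouched.
    """
    rng = random.Random(seed)
    # Copy every record so the input data is not modified.
    masked = [dict(row) for row in rows]
    for column in sensitive_columns:
        values = [row[column] for row in rows]
        # Shuffle each column on its own so that a first name, last name
        # and address from the same customer are unlikely to stay together.
        rng.shuffle(values)
        for row, value in zip(masked, values):
            row[column] = value
    return masked


# Made-up sample records; not real customer data.
customers = [
    {"customer_id": 1, "first_name": "Alice", "last_name": "Nguyen",
     "street_number": "12", "street_name": "Oak Street", "balance": 1500},
    {"customer_id": 2, "first_name": "Bob", "last_name": "Martin",
     "street_number": "48", "street_name": "Elm Avenue", "balance": 320},
    {"customer_id": 3, "first_name": "Carla", "last_name": "Silva",
     "street_number": "7", "street_name": "Pine Road", "balance": 9800},
    {"customer_id": 4, "first_name": "Deepak", "last_name": "Rao",
     "street_number": "230", "street_name": "Maple Drive", "balance": 45},
]

SENSITIVE = ["first_name", "last_name", "street_number", "street_name"]
masked_customers = shuffle_mask(customers, SENSITIVE, seed=42)
```

A plain random shuffle can leave some values in their original row by chance, particularly on small data sets, so a production process would normally add checks or further masking rules on top of a step like this.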
Solution
Elixirr Digital used IBM InfoSphere DataStage to automate the obfuscation of Bank of America’s production data and load it into a QA enterprise data warehouse (EDW).
Benefits
The process could run weekly to populate the test warehouse with fresh data. Before this solution, the data was refreshed only quarterly because of the complexity and duration of the processing.