Executive Summary
FrontBurner set out to unlock the hidden value of its data. A key enabler for the business was modernising enterprise data access and enhancing analytical capabilities. Elixirr Digital and FrontBurner partnered to define and execute a multi-phased data and analytics transformation roadmap, and work on the Modern Data Platform began.
Challenge
FrontBurner uses several systems to operate its business, including NCR Aloha, Toast, and Microsoft GP, as well as external systems from OpenTable and Marketing Vitals. While these systems perform their independent tasks well, making data available for robust, consolidated reporting across lines of business and functional areas is challenging. To address current challenges and position the company for future data needs and opportunities, FrontBurner was looking for a strategic partner it could work with to address its data and analytics needs.
Approach
During the Foundation Phase, the new FrontBurner EDH (Enterprise Data Hub) was deployed to the business. The overall solution was well received as it:
- Implemented an EDH following a lakehouse architecture utilising modern technologies and frameworks
- Simplified and optimised the data management process to reduce data duplication and acquisition/storage inefficiencies
- Redesigned and deployed new data models in the EDH
- Involved a cross-functional business team for requirements, testing, and validation to align requirements and develop consistent standards
AWS-Based Architecture Modernisation & Solution Overview
FrontBurner required a modern architecture that could consolidate its existing disparate warehouses into a single enterprise data hub. Key capabilities were identified and aligned with the following AWS technologies: AWS Glue to ingest data from APIs or static files; Amazon Redshift as the data warehouse database; Amazon DynamoDB for job and database object metadata; Amazon Simple Storage Service (S3) as raw storage for landing and processed data; and Amazon Redshift Spectrum to support data quality testing and raw queries.
Overview:
- Ingestion Engine: This asset is responsible for extracting data assets from source systems. It manages the scheduling and orchestration of the extraction and handles publishing this data for further system consumption (see the sketch after this list).
- Data Lake: A foundational data lake was created using several AWS services. The lake leverages Amazon S3 to store raw flat files pulled from the source systems.
- Transformation Engine: Responsible for transforming raw data and loading it into the data warehouse. This handles data cleansing, business rules, aggregations, etc.
- Data Warehouse: This Amazon Redshift-based solution follows dimensional modelling techniques to store and deliver relational data to consumers (via Power BI).
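To make the hand-off between these components concrete, the following is a minimal sketch of how an orchestration script might start a Glue ingestion workflow and confirm that raw extracts landed in the S3 data lake. The workflow, bucket, and prefix names are assumptions for illustration, not FrontBurner's actual resources:

```python
# Sketch of the ingestion hand-off: Glue extracts source data, S3 stores it.
# Workflow, bucket, and prefix names are hypothetical.
import time

import boto3

glue = boto3.client("glue")
s3 = boto3.client("s3")


def ingest_and_verify(workflow: str = "edh-ingest-daily",
                      bucket: str = "edh-raw-zone",
                      prefix: str = "landing/opentable/") -> None:
    # Start the Glue workflow that extracts data from the source systems.
    run_id = glue.start_workflow_run(Name=workflow)["RunId"]

    # Poll until the workflow reaches a terminal state.
    while True:
        status = glue.get_workflow_run(Name=workflow, RunId=run_id)["Run"]["Status"]
        if status in ("COMPLETED", "STOPPED", "ERROR"):
            break
        time.sleep(30)

    if status != "COMPLETED":
        raise RuntimeError(f"Ingestion workflow ended with status {status}")

    # Confirm raw extracts landed in the data lake's landing area.
    listing = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    print(f"{listing.get('KeyCount', 0)} raw files under s3://{bucket}/{prefix}")


if __name__ == "__main__":
    ingest_and_verify()
```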
DevOps
- Elixirr Digital DevOps practices were implemented from the very beginning, following the AWS Well-Architected Framework. A dedicated account was provisioned and configured using AWS Control Tower for each deployment environment (development, test, and production).
- All resources are defined and deployed using AWS CloudFormation, which provides a documented infrastructure definition, enables easy and consistent deployment across all environments, and simplifies disaster recovery.
- AWS services are used for all CI/CD processes: CodeCommit stores the CloudFormation templates and all application-related scripts and code. When developers commit changes, an EventBridge event rule starts a CodePipeline job that builds and deploys artifacts automatically using CodeBuild.
- A solution design guideline was to utilise serverless services as much as possible, allowing maximum flexibility in resource scaling, deployment, and cost management.
- An Amazon VPC provides the fundamental network layer, shared across all environments, greatly cutting cost while still providing strong resilience.
- Amazon S3 buckets are used as inexpensive, scalable object storage with encryption and versioning enabled.
- DynamoDB is used as a fast, reliable, and inexpensive production-ready control database for Glue jobs (see the control-table sketch after this list).
- AWS Glue jobs, workflows, triggers, databases, tables, and crawlers orchestrate the entire ETL process and hydrate the data lake with fresh, well-structured data multiple times a day.
- Custom containers executing dbt model transformation jobs are scheduled and run on AWS ECS Fargate with minimal cost and background infrastructure configuration.
- Amazon Redshift, as the production-ready data warehouse (lakehouse), is the central application component, holding data ready and easily accessible for any analytic process.
- SSM Parameter Store and Secrets Manager are used to securely store parameters and secrets, which any other application component can easily retrieve (see the retrieval sketch after this list).
- Amazon WorkSpaces provides developers with ready-to-use workstations that allow access to resources residing securely in the private network.
- AWS SSO is used to manage user access permissions for all required resources from a central location, following the principle of least privilege.
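As an illustration of the DynamoDB control-table pattern mentioned above, here is a minimal sketch, assuming a hypothetical table name and key schema, of how a Glue job might read and advance an incremental-load watermark:

```python
# Sketch of a DynamoDB control table tracking incremental-load watermarks.
# The table name ("edh-job-control") and "job_name" key are hypothetical.
import boto3

table = boto3.resource("dynamodb").Table("edh-job-control")


def get_watermark(job_name: str) -> str:
    """Return the last successfully loaded timestamp for a job."""
    item = table.get_item(Key={"job_name": job_name}).get("Item", {})
    return item.get("last_loaded_at", "1970-01-01T00:00:00")


def set_watermark(job_name: str, loaded_at: str) -> None:
    """Advance the watermark after a successful load."""
    table.update_item(
        Key={"job_name": job_name},
        UpdateExpression="SET last_loaded_at = :ts",
        ExpressionAttributeValues={":ts": loaded_at},
    )


# A Glue job would extract only records newer than the stored watermark,
# then advance it once the extract lands in S3.
since = get_watermark("opentable_reservations")
```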
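Similarly, a minimal sketch of how an application component might retrieve configuration from Parameter Store and credentials from Secrets Manager; the parameter and secret names are assumptions:

```python
# Sketch of retrieving configuration and credentials at runtime.
# The parameter and secret names are hypothetical.
import json

import boto3

ssm = boto3.client("ssm")
secrets = boto3.client("secretsmanager")

# Plain configuration values live in SSM Parameter Store.
redshift_host = ssm.get_parameter(Name="/edh/prod/redshift/host")["Parameter"]["Value"]

# Credentials live in Secrets Manager, encrypted at rest.
creds = json.loads(
    secrets.get_secret_value(SecretId="edh/prod/redshift/etl-user")["SecretString"]
)

print(f"Connecting to {redshift_host} as {creds['username']}")
```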
Data Modelling & Transformations
To make data transformations fast, reliable, and easy to update, Elixirr Digital leveraged dbt (Data Build Tool). The data warehouse has four main layers (a sketch of the container entrypoint that rebuilds them follows this list):
- dbt sources: These views represent incoming data from the data lake. Objects in this schema have explicit source names and receive preliminary transformations such as renaming and data type changes.
- dbt staging: These views are built from the dbt sources. Here we see more complex transformations, joins, and the introduction of hashes.
- EDW: These tables represent incoming data from one or more dbt staging views. Objects in this schema have explicit business names such as customer or invoice.
- DM: This schema typically holds views that join the EDW tables together to produce a subject area such as customer or invoice. We expect reporting software such as Power BI or Tableau to connect to this layer.
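To show how these layers might be rebuilt in practice, here is a minimal sketch of an entrypoint for the Fargate-hosted dbt containers mentioned under DevOps. The command sequence is a common dbt pattern, not FrontBurner's actual job code:

```python
# Hypothetical entrypoint for the ECS Fargate container that runs dbt.
# The command sequence is a common pattern, assumed for illustration.
import subprocess
import sys


def run(cmd: list[str]) -> None:
    print("->", " ".join(cmd))
    subprocess.run(cmd, check=True)


def main() -> None:
    # dbt resolves the model DAG itself, so a single `dbt run` rebuilds
    # all four layers in dependency order: sources -> staging -> EDW -> DM.
    run(["dbt", "deps"])   # install package dependencies
    run(["dbt", "run"])    # build sources, staging, EDW, and DM models
    run(["dbt", "test"])   # run schema and data tests against the results


if __name__ == "__main__":
    try:
        main()
    except subprocess.CalledProcessError as exc:
        sys.exit(exc.returncode)
```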
Results and Benefits
The FrontBurner AWS infrastructure supports high volumes of data from a variety of data sources. The solution provides deeper insight into FrontBurner's data because it is highly available and stored in a single central location. Furthermore, the solution has improved security and performance at an optimised cost.
Using this solution, we were able to achieve the following:
- Delivered a highly structured, organised, scalable, and reusable platform enabling low-friction ingestion of future data sources
- Previously, end users had to enter each data source's portal and click through multiple pages to finally view their data. A 5-10 minute journey was shortened to less than 45 seconds.
- Integrated and homogenised data from multiple sources, providing conformed enterprise reporting to all business units
- Ingested and migrated all data from disparate sources with no losses. Data is now updated to the most recent value, ensuring all reporting is accurate and closer to real time. End users are now able to quickly create reports without having to refresh data.
- Automated inbound pipelines and EDH processes to enable daily refreshes, providing the business with easy-access insights
- Provided improved analytic insights from transformed data. With the new and improved data model, end users no longer need to filter for certain fields and items; dbt refreshes the model twice a day, ensuring any data changes are reflected.
- Improved operational efficiency by empowering super users to independently discover and analyse data, which improved data visibility by 75% in terms of time from ingestion to business users consuming the data.