Challenge
iRobot is focused on building a next-generation data platform. Millions of robots across the globe generate enormous amounts of telemetry and sensor data, which multiple departments need in order to provide better customer service, engineer better robots, and identify new products and services. Additional data is needed to understand mobile application interactions, the manufacturing pipeline, and customer service incidents. Given the volume, variety, and velocity of these sources, iRobot needed a well-architected data platform that ensured security, stability, and scalability. Moreover, with global data sovereignty and PII regulations to comply with, iRobot needed a solution that would protect and segregate data assets appropriately.
Solution
Elixirr Digital inherited an architecture that could not scale and was constantly failing on basic data ingestion. We quickly took an inventory of assets, designed a roadmap, and began implementing the next-generation framework. We opted to utilise more AWS managed services, which are by nature more fault-tolerant and scalable. Spark jobs were replaced with Lambda, Firehose, and Data Pipeline. An expensive NoSQL database was replaced by ElastiCache and Athena. Solutions were introduced to eliminate data loss further upstream in the ingestion pipeline. In Redshift, we implemented proper dimensional models with appropriate compression, sorting, and distribution.
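To illustrate the Lambda and Firehose pattern described above, here is a minimal sketch of a Firehose data-transformation function written in Python. It is not iRobot's actual code. The telemetry schema and the field names in PII_FIELDS are assumptions made for the example; only the record contract (recordId, result, base64-encoded data) follows the standard Firehose transformation format. Malformed records are flagged rather than silently dropped, which is one way to avoid data loss during ingestion.

```python
import base64
import json

# Hypothetical PII field names; the real telemetry schema is not described here.
PII_FIELDS = {"email", "owner_name"}


def handler(event, context):
    """Validate each Firehose record, strip assumed PII fields, and re-encode it."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("telemetry payload must be a JSON object")
        except (ValueError, TypeError):
            # Return the original data marked as failed so Firehose can
            # deliver it to its error output instead of losing it.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue

        cleaned = {k: v for k, v in payload.items() if k not in PII_FIELDS}
        # Newline-delimited JSON keeps the delivered objects easy to query.
        line = json.dumps(cleaned) + "\n"
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(line.encode("utf-8")).decode("ascii"),
        })
    return {"records": output}
```

Because the function depends only on the standard library, it can be exercised locally by passing it a dictionary shaped like a Firehose event before it is deployed.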
Benefits
iRobot’s data platform can now support workloads regardless of the volume, velocity, or variety of the data. Infrastructure costs are significantly lower, and business users have more trust in their data. iRobot is now exploring ways to leverage these data assets beyond floor cleaning.