Data Siloes

The value in centralised and connected data

  • Siloed data is data held in sources that cannot communicate with each other.
  • Each of these systems holds value that needs to be extracted. (Extract and load)
  • Connecting that data together increases its value considerably. (Transform)

Data siloes make it difficult for business intelligence to have a single view of the data. For example, an eCommerce company may want to compare the orders containing a certain product in a particular month against the marketing metrics for the campaigns related to that product. That question cuts across the eCommerce and marketing functions within the business.

The problem is that marketing obtain all their data from the front-end reports generated by the performance marketing channels they use, which are exported into Excel files and aggregated by channel on a monthly basis. eCommerce, however, view data using only the front-end reports generated by, say, Shopify, on a daily basis by product. The two departments differ in both the granularity of their data and how they handle it, so reconciling them takes time. This also ignores the other teams in the business, such as product managers, who may want data on eCommerce sales and would then need a further reconciliation between product and eCommerce alone.
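The granularity mismatch described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the sample rows, and the idea that the marketing export carries a product column are all hypothetical, and real exports from Shopify or a marketing channel will look different. The sketch rolls the daily eCommerce rows up to a monthly grain so they can sit next to the monthly marketing figures.

```python
from collections import defaultdict

# Hypothetical daily, per-product rows, as an eCommerce report might export them.
daily_orders = [
    {"date": "2024-03-01", "product": "mug", "orders": 4},
    {"date": "2024-03-15", "product": "mug", "orders": 6},
    {"date": "2024-04-02", "product": "mug", "orders": 3},
]

# Hypothetical monthly, per-channel rows, as a marketing spreadsheet might hold them.
monthly_marketing = [
    {"month": "2024-03", "channel": "search", "product": "mug", "spend": 120.0},
    {"month": "2024-03", "channel": "social", "product": "mug", "spend": 80.0},
    {"month": "2024-04", "channel": "search", "product": "mug", "spend": 50.0},
]


def orders_by_month(rows):
    """Roll daily orders up to (month, product) to match the marketing grain."""
    totals = defaultdict(int)
    for row in rows:
        month = row["date"][:7]  # "YYYY-MM-DD" -> "YYYY-MM"
        totals[(month, row["product"])] += row["orders"]
    return dict(totals)


def spend_by_month(rows):
    """Sum marketing spend across channels for each (month, product)."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["month"], row["product"])] += row["spend"]
    return dict(totals)


def combine(orders, spend):
    """Place orders and spend side by side for every (month, product) seen."""
    keys = sorted(set(orders) | set(spend))
    return [
        {
            "month": month,
            "product": product,
            "orders": orders.get((month, product), 0),
            "spend": spend.get((month, product), 0.0),
        }
        for month, product in keys
    ]


combined = combine(orders_by_month(daily_orders), spend_by_month(monthly_marketing))
for row in combined:
    print(row)
```

Every pair of departments that wants a shared view needs its own version of this reconciliation, which is why the effort grows as more teams become involved.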



Data siloes form when there is no centralised location for extracted data to be housed.


Without a centralised data warehouse, a common solution is to house data in many spreadsheets, then either build analytics in the spreadsheet software or export the data to a business intelligence (BI) tool.

Under this approach, the compute power of the spreadsheet software or BI tool becomes strained and ultimately does not scale, since these tools are not designed to ingest large volumes of data.



A decentralised data system means data across systems is tricky to combine and draw insights from.


Data warehouses are designed to house large volumes of data and transform them at speed, which for cloud data warehouses is largely due to their being designed to run purely on the cloud. A solution to data siloes is therefore to centralise all data in a warehouse that acts as a single source of truth.
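As a rough sketch of what centralising buys, the example below loads data from two sources into one store and answers the cross-department question with a single query. It uses an in-memory SQLite database purely as a stand-in for a data warehouse, and the table names, column names, and sample rows are hypothetical. The load step corresponds to "extract and load", and the query to "transform".

```python
import sqlite3


def build_store():
    """Load both sources into one database (in-memory stand-in for a warehouse)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_date TEXT, product TEXT, quantity INTEGER)"
    )
    conn.execute(
        "CREATE TABLE marketing (month TEXT, product TEXT, channel TEXT, spend REAL)"
    )
    # Hypothetical rows standing in for the eCommerce extract (daily, by product).
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("2024-03-01", "mug", 4),
            ("2024-03-15", "mug", 6),
            ("2024-04-02", "mug", 3),
        ],
    )
    # Hypothetical rows standing in for the marketing extract (monthly, by channel).
    conn.executemany(
        "INSERT INTO marketing VALUES (?, ?, ?, ?)",
        [
            ("2024-03", "mug", "search", 120.0),
            ("2024-03", "mug", "social", 80.0),
            ("2024-04", "mug", "search", 50.0),
        ],
    )
    return conn


# Transform: bring both sources to a monthly, per-product grain, then join them.
QUERY = """
WITH monthly_orders AS (
    SELECT substr(order_date, 1, 7) AS month, product, SUM(quantity) AS units
    FROM orders
    GROUP BY substr(order_date, 1, 7), product
),
monthly_spend AS (
    SELECT month, product, SUM(spend) AS spend
    FROM marketing
    GROUP BY month, product
)
SELECT o.month, o.product, o.units, s.spend
FROM monthly_orders AS o
JOIN monthly_spend AS s
  ON s.month = o.month AND s.product = o.product
ORDER BY o.month, o.product
"""

conn = build_store()
rows = conn.execute(QUERY).fetchall()
for month, product, units, spend in rows:
    print(month, product, units, spend)
conn.close()
```

Because both sources live in one place, a product manager asking a different question only needs a different query rather than another spreadsheet reconciliation.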