
"A lot of them have bought into the idea that they can use one tool that does everything, and that one tool is the warehouse. For a certain period of time, it actually does work really well, until it doesn't. We see the warehouse start to struggle and slow down; it can no longer take in as many data sources as it used to while keeping the same latency."
"If I can't get the data from the warehouse, where should I be going? People ask, 'Where can I find this data?' Very common. They struggle to innovate. A very simple example from my previous job: we had some product engineers who wanted to use data from the warehouse, and the query they wanted was very simple, SELECT * FROM some table. That query ran for about five minutes, which in the analytics world is bearable, but in the operational world is a no-go."
"It's expensive. These warehouses are incredibly powerful, and we're actually really lucky to be able to work with these technologies, but they scale by adding more machines, and those machines are not free. You often have to add very beefy machines to scale your processing. I've had to have a lot of these conversations with customers and clients, and it makes me very sad to tell them that they can't carry on the way that they have."
Many scale-ups and start-ups adopt the data warehouse as a single unified tool for analytics and operational needs. Warehouses can perform well initially but begin to struggle as data sources and operational demands grow, producing increased latency, disorganized tables, and difficulty locating the correct data. Operational use cases demand low-latency queries, which warehouses may fail to deliver (e.g., simple SELECT * queries that take minutes). Scaling warehouses often requires expensive, beefy machines, increasing costs. These limitations impede innovation and force organizations to reconsider relying solely on a single warehouse solution.
Read at InfoQ