Data Lake: new data store scenario
The Big Data revolution has redefined how business data is managed. On the one hand, it has vastly increased the amount of information available and opened up scope for new kinds of analysis; on the other, it has accelerated technological development, driven above all by the open source world, towards tools such as Apache Hadoop and Apache Spark and other technologies that handle vast amounts of data.
The new cloud-based data-storage scheme structures data in a scalable way and lets you run a practically unlimited number of experiments on it.
In October 2010, James Dixon, CTO of Pentaho, a business intelligence company, introduced the term data lake on his blog with the following definition:
If you think of a datamart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.
The fundamental difference between traditional storage structures, such as data silos or data warehouses, and a data lake is that a data warehouse stores data in a structured, rigid way, whereas a data lake uses a flat architecture. Each element receives an identifier and a set of descriptive metadata. The data lake can therefore be queried for relevant data, which can then be analysed to answer specific questions.
The data lake overcomes the rigidity of data silo and DWH (data warehouse) designs and prevents the proliferation of structured databases. Instead of multiple data sources that are each managed independently, you can opt for a single data lake, which makes sharing information easier and faster while also reducing server licensing costs.
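The flat architecture described above can be illustrated with a minimal sketch. It is not a real data lake product: the `DataLake` and `LakeObject` classes, the `ingest` and `find` methods, and the metadata keys `source` and `format` are all hypothetical names chosen for the example, and an in-memory dictionary stands in for cloud object storage.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class LakeObject:
    """One raw element in the lake: an opaque payload plus descriptive metadata."""
    object_id: str
    payload: bytes
    metadata: dict = field(default_factory=dict)


class DataLake:
    """Hypothetical in-memory stand-in for a flat object store."""

    def __init__(self):
        # Flat namespace: identifier -> object, with no tables or hierarchy.
        self._objects = {}

    def ingest(self, payload: bytes, **metadata) -> str:
        """Store the payload as-is and return the identifier assigned to it."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = LakeObject(object_id, payload, metadata)
        return object_id

    def find(self, **criteria) -> list:
        """Return every object whose metadata matches all given key/value pairs."""
        return [
            obj for obj in self._objects.values()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


lake = DataLake()
lake.ingest(b'{"call_id": 1, "duration_s": 42}', source="billing", format="json")
lake.ingest(b"2021-03-01,FCO,LHR", source="flights", format="csv")
lake.ingest(b'{"call_id": 2, "duration_s": 7}', source="billing", format="json")

# Query the lake by metadata; the payloads themselves are never inspected.
billing = lake.find(source="billing")
```

Data in different formats sits side by side in the same store, and only the metadata is used to locate it. Interpreting the payload is left to whoever reads it.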
Eurokleis has long-standing experience consulting for customers that handle large amounts of data in the telecommunications, ICT and air transport sectors.
Eurokleis has the skills to guide the customer in choosing the best solution for each scenario: data silos, ODS (Operational Data Store), DWH, or data lake.
The Eurokleis team offers the following services:
- Feasibility studies
- Analysis of requirements and features
- Technological audit
- Design of technical architectures
- Structured database design
- Data lake implementation
- Change management and knowledge transfer
Digital Business: new data store scenario
In a context where data flows continuously from countless sources, information systems must be able to archive that data and make it available whenever it is needed.
This requires technological efficiency that eliminates latency between a request for data and its availability, as well as infrastructure efficiency that allows quick reconfiguration to ensure prompt access to data using new real-time variables.
All the Big Data figures
- Organisations that are unsure how to get value from Big Data
- Organisations that believe defining a Big Data strategy is one of the main obstacles
- Organisations that state Big Data skills are rare and expensive
Data Warehouse vs Data Lake
Data Warehouse: Structured data: batches of information organised and managed by applying business rules
Data Lake: Structured, semi-structured, unstructured and raw data
Data Warehouse: Schema on write
Data Lake: Schema on read
Data Warehouse: Expensive for large quantities of data
Data Lake: Designed to maintain low costs
Data Warehouse: Mature
Data Lake: Maturing
Data Warehouse: Not very agile; the initial configuration is difficult to change
Data Lake: Very agile; the configuration can be modified again and again as needed
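The "schema on write" versus "schema on read" row of the comparison can be made concrete with a short sketch. Everything here is an assumption made for illustration: the `SCHEMA` dictionary, the function names `warehouse_write`, `lake_write` and `lake_read`, and the sample records. Plain Python lists stand in for the actual storage engines.

```python
import json

# Hypothetical schema: column name -> required type.
SCHEMA = {"customer_id": int, "amount": float}


def warehouse_write(table: list, record: dict) -> None:
    """Schema on write: validate before storing and reject anything that does not fit."""
    if set(record) != set(SCHEMA):
        raise ValueError(f"unexpected columns: {sorted(record)}")
    for column, expected in SCHEMA.items():
        if not isinstance(record[column], expected):
            raise ValueError(f"{column} must be {expected.__name__}")
    table.append(record)


def lake_write(store: list, raw: str) -> None:
    """Schema on read: store the raw text untouched, with no validation."""
    store.append(raw)


def lake_read(store: list, schema: dict) -> list:
    """Apply a schema only when reading; skip records that cannot be coerced."""
    rows = []
    for raw in store:
        try:
            doc = json.loads(raw)
            rows.append({col: cast(doc[col]) for col, cast in schema.items()})
        except (ValueError, KeyError, TypeError):
            continue  # the raw record stays in the lake for other readers
    return rows


warehouse = []
warehouse_write(warehouse, {"customer_id": 1, "amount": 9.5})

lake = []
lake_write(lake, '{"customer_id": "2", "amount": "12.0", "channel": "web"}')
lake_write(lake, "not json at all")

# The schema is chosen by the reader, at query time.
rows = lake_read(lake, SCHEMA)
```

The warehouse path pays the validation cost up front and guarantees clean rows. The lake path accepts anything and defers interpretation, which is what makes it agile but also leaves data quality to each reader.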