What are the differences between traditional (RDBMS) and Hadoop database systems? Both traditional relational (RDBMS) and Hadoop database systems offer similar functionality for collecting, storing, processing, recovering, extracting and manipulating data. However, they take radically different approaches to data processing and to the problems they are trying to solve. RDBMS systems … Continue reading How to decide between RDBMS and HADOOP?
Category: Big data
Why did we start on this path? It all starts with our customers’ hybrid data management strategy: the need to embrace the proliferation of data that is creating new opportunities for businesses to better understand their customers, their industry and their own operations. What do I mean by “proliferation”? Well, recent studies have suggested that … Continue reading How to build with IBM and MongoDB Enterprise Document Store
If you don’t know what data you have, how can you manage it effectively and generate value from it? With continued growth and a series of fast-paced bank acquisitions and mergers, BBVA Compass’s data grew to over 2.5 petabytes (PB). Much of the data was spread across shared network drives and various legacy … Continue reading 5 data governance lessons from gardening
The Hadoop Distributed File System (HDFS) was developed using a distributed file system design and runs on commodity hardware. Unlike other distributed systems, HDFS is highly fault-tolerant and designed to run on low-cost hardware. HDFS holds very large amounts of data and provides easier access. To store such huge volumes of data, the files are stored across multiple machines. These files are … Continue reading Hadoop – HDFS Overview
A time series is a sequence of observations of categorical or numeric variables indexed by a date or timestamp. A clear example of time series data is the price of a stock over time. In the following table, we can see the basic structure of time series data. In this case the observations are recorded every … Continue reading Big Data Analytics – Time Series Analysis
When analyzing data, it is possible to take a statistical approach. The basic tools needed to perform basic analysis are correlation analysis, analysis of variance and hypothesis testing. Working with large datasets is not a problem for these methods, as they aren’t computationally intensive, with the exception of correlation analysis. In this case, … Continue reading Big Data Analytics – Statistical Methods
The users I spoke with ranged from seasoned data warehouse professionals to people better described as application developers with limited data experience. Given the diversity of users (who come from diverse organizations with diverse requirements), I got diverse ideas about what a warehouse is (and is not), plus whether or not Hadoop … Continue reading Can Hadoop Replace a Data Warehouse?