Hive

Archi & Techno

Using Hadoop for Value At Risk calculation Part 6

In the first part, I described why Hadoop is of interest for Value At Risk calculation, namely the ability to analyze intermediate results. In the next three parts (2, 3 and 4), I detailed how to implement the VAR calculation with Hadoop. Then, in the fifth part, I looked at how to analyze the intermediate results with Hive. Finally, I will now give some performance figures for Hadoop and compare them with those of GridGain. Based on those figures, I will detail some key performance points…

Archi & Techno

Using Hadoop for Value At Risk calculation Part 5

In the first part of this series, I explained why the Hadoop framework could be useful to compute the VAR and analyze intermediate values. In the second, third and fourth parts, I gave two concrete implementations of the VAR calculation with Hadoop, along with optimizations. Another benefit of using Hadoop for Value At Risk calculation is the ability to analyze the intermediate values inside Hadoop through Hive. That is the subject of this (shorter) part of the series.

Archi & Techno

How to “crunch” your data stored in HDFS?

HDFS stores huge amounts of data, but storing it is worthless if you cannot analyze it and extract information from it. Option #1: Hadoop, the Map/Reduce engine. Hadoop overview: Hadoop is a Map/Reduce framework that works on HDFS or on HBase. The main idea is to decompose a job into several identical tasks that can be executed close to the data (on the DataNode). In addition, these tasks are executed in parallel: this is the Map phase. Then all these intermediate results are merged into one result…
