A chat with Doug Cutting about Hadoop

14/01/2016, by Nelly Grellier
Tags: Data & AI, Events

We had the chance to interview Doug Cutting during the Cloudera Sessions in Paris in October 2014. Doug is the creator of Hadoop and Cloudera's Chief Architect. Here is our exchange:


How does it feel to see Hadoop actually becoming the must-have, the default way of storing and computing over data in large enterprises?

Rationally it feels very good. It’s a technology that’s supposed to do that. Emotionally it’s very satisfying, but I must also say I have been very lucky. I was in the right place at the right time and happened to be the person. Had I not done this, someone else would have by now.


It’s funny because yesterday you were mentioning how Google released that paper about GFS and then the one about MapReduce, and you seemed surprised that no one else had gone and implemented the papers. How would you explain this? Was it because it was a very big task that some people were daunted by taking on, or…?

I think, again, I had the right experience from having put some work into open source. I had worked on search engines, so I could see the value in the technology and I understood the problem. And I think I’ve also been in the software business long enough that I knew what it would take to build a project that would be useful, that would be used. I think no one else was positioned with that combination of properties. I was able to take advantage of these papers, implement them as open source, and get them out to people. That’s my guess, I don’t know. It wasn’t my plan.

Were you expecting that it would have such a big, big impact?

No, not at all.

OK, now I guess it’ll be more of a technical question: you mentioned yesterday (I was there yesterday and today) that there are all these tools coming out, building on top of Hadoop and bringing new technologies and new usages of data. How do you see Hadoop changing, architecturally speaking, to be able to provide even more capabilities in the future?

It’s a very general architecture. It’s in many ways, much like I said, an operating system. An operating system, as I’ve been showing, has storage, a scheduler, and security mechanisms. The challenge already is to support all kinds of different applications. So I think that the design it has right now is more or less sufficient to permit a very wide range.

Just like operating systems haven’t changed since Windows.

Yeah, not fundamentally. Since the 60s. Those basic capabilities give you a platform on which you can develop lots of different applications that can share the hardware, in a sense. It’s really… Well, the job of an OS is sort of to “get out of the way” and let applications share the hardware.

Provide abstractions, as Jim Baum said.


To deal with complexity.

And so I think that’s a role that Hadoop is filling more and more.

I don’t know that it needs a radical re-architecture to do that. Whether people will implement alternate file systems… That might happen, we’ll see.

OK. Thank you so much. And with all these tools, like Kafka for log aggregation across data centers and Storm for stream processing, do you see new usages for data that haven’t come out yet? You can search on it, you can index it, you can stream it and process it in real time…

We think there are lots of opportunities for more vertical applications in different industries that are very specific. Things that can process images, tools that can process data… There are lots of different areas where there aren’t tools today. Not to mention verticals like insurance and banking and so on. Some people see commercial offerings and some people see open source offerings. I think right now what people are seeing are more the lower-level tools that can be plugged in. I think more and more, higher and higher up the stack, we’ll see open-source implementations commoditizing the value of the stack. That’s an ongoing process.


Want to know more about Hadoop and the Hadoop ecosystem? Have a look at our Hadoop map (English version)!

Our Hadoop White Paper and Roadmap are also available for free download (French only).