DataScience.US
A Data Professionals Community
Browsing Category

Big Data

The growing Hadoop ecosystem

The Apache Hadoop framework, consisting of Hadoop Common, the Hadoop Distributed File System (HDFS), Hadoop YARN, and Hadoop MapReduce, is a core component of most big data projects and of the creation of data lakes. In addition, there are…

7 Online Big Data Courses you must know about

Analytics, big data, and data science are hot areas in the industry, and professionals who have these skills are in high demand. Big Data University is an IBM-founded initiative based on the idea that education should be a right, not a…

The Hadoop ecosystem and its core technologies

They began a project called Nutch to do this but needed a scalable method to store the content of their indexing. The standard method to organize and store data in 2002 was by means of relational database management systems (RDBMS) which…

Database and Data Management on Hadoop

Since the advent of Google’s BigTable, Hadoop has had an interest in the management of data. While there are some relational SQL databases or SQL interfaces to HDFS data, like Hive, much data management in Hadoop uses non-SQL techniques to…

Apache Hadoop 3.0.0-alpha2 Released

This is the second alpha release in the 3.0.0 release series leading up to 3.0.0 GA, and incorporates 857 new fixes, improvements, and features since 3.0.0-alpha1 last September. It’s worth reading our previous blog post about 3.0.0-alpha1;…

Understanding data ownership in the data lake

This article was written by Elizabeth Koumpan, Executive Architect, IBM. There is so much talk about data as a new natural resource. The amount of data organizations and citizens across the globe produce is authored in many systems and…
