Data Science

Discussions and topics related to Data Science/Analytics ongoing trends…

Data Modeling in Hive

Apache Hive is an open-source data warehouse infrastructure tool built on top of the Hadoop Distributed File System (HDFS). Hive provides data summarization, querying, and ad-hoc analysis of large datasets stored in HDFS. It processes structured and semi-structured data in the Hadoop system. Hive is …

YARN Architecture

Apache YARN is one of the core components of Hadoop. YARN (Yet Another Resource Negotiator) is the resource management layer of the Hadoop architecture and was introduced in Hadoop 2.x. To run and process the data stored in HDFS, YARN allows different data processing engines, such as graph processing, interactive processing, stream …
