Data Modeling in Hive
Apache Hive is an open-source data warehouse infrastructure tool built on top of the Hadoop Distributed File System (HDFS). Hive provides data summarization, querying, and ad-hoc analysis of large datasets stored in HDFS. It processes structured and semi-structured data in Hadoop. Hive is …
YARN Architecture
Apache YARN is one of the core components of Hadoop. YARN (Yet Another Resource Negotiator) is the resource management layer of the Hadoop architecture. It was introduced in Hadoop 2.x. To run and process the data stored in HDFS, YARN allows different data processing engines, such as graph processing, interactive processing, stream …