Main Article Content
Hadoop is an open source framework used to store large data sets ranging in size from gigabytes to petabytes. Traditional methods of data storage will store the data in one huge computer, allowing only one data set to be analyzed at any given time, however with Hadoop architecture, data set clustering is feasible, allowing numerous datasets to be assessed concurrently without interfering with each other. When compared to other data storage systems, Hadoop's data processing speed is quicker and provides a faster response. Hadoop architecture also provides building pieces for numerous services and applications. Spark, Hive, Presto, Hbase, and other popular Hadoop applications.