Translations:Apache Spark/2/en: Difference between revisions
Latest revision as of 20:50, 7 December 2018
Apache Spark is an open-source framework for distributed computation, initially developed by the AMPLab at the University of California, Berkeley, and now a project of the Apache Software Foundation. Unlike Hadoop's implementation of MapReduce, which relies on disk storage, Spark provides primitives that operate on data held in memory, which can yield up to 100x the performance of Hadoop in certain applications. Because data loaded in memory can be queried repeatedly without being re-read, Spark is especially well suited to machine learning and interactive data analysis.
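The benefit of keeping data in memory for repeated queries can be illustrated with a minimal sketch in plain Python. This is not the Spark API: the names `DiskDataset`, `CachedDataset`, `write_sample_file` and `run_queries` are hypothetical and exist only for this illustration. The sketch counts how often the same file is read when three queries are run with and without an in-memory copy.

```python
import json
import os
import tempfile


def write_sample_file(n=1000):
    """Write n JSON records to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".jsonl")
    with os.fdopen(fd, "w") as f:
        for i in range(n):
            f.write(json.dumps({"id": i, "value": i % 10}) + "\n")
    return path


class DiskDataset:
    """Hypothetical dataset that re-reads its file for every query."""

    def __init__(self, path):
        self.path = path
        self.disk_reads = 0  # number of times the file was read

    def records(self):
        self.disk_reads += 1
        with open(self.path) as f:
            return [json.loads(line) for line in f]


class CachedDataset(DiskDataset):
    """Hypothetical dataset that reads its file once, then serves from memory."""

    def __init__(self, path):
        super().__init__(path)
        self._cached = None

    def records(self):
        if self._cached is None:
            self._cached = super().records()  # the only disk read
        return self._cached


def run_queries(dataset):
    """Run three independent queries against the same dataset."""
    count = len(dataset.records())
    total = sum(r["value"] for r in dataset.records())
    largest = max(r["value"] for r in dataset.records())
    return count, total, largest


if __name__ == "__main__":
    path = write_sample_file()
    try:
        disk, cached = DiskDataset(path), CachedDataset(path)
        # Both variants return the same answers; only the I/O differs.
        assert run_queries(disk) == run_queries(cached)
        print("disk reads without cache:", disk.disk_reads)
        print("disk reads with cache:", cached.disk_reads)
    finally:
        os.remove(path)
```

Both variants return identical results, but the cached one touches the disk only once regardless of how many queries follow. Spark applies a comparable idea across the memory of a whole cluster, which is why iterative and interactive workloads benefit the most.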