Large Scale Machine Learning (Big Data): Difference between revisions



<!--T:4-->
[https://scikit-learn.org/stable/index.html Scikit-learn] is a Python module for machine learning that is built on top of SciPy and distributed under the 3-Clause BSD license. This popular package features an intuitive API that makes building fairly complex machine learning pipelines very straightforward. However, many of its implementations of common methods such as GLMs and SVMs assume that the entire training set can be loaded in memory, which might be a showstopper when dealing with massive datasets. Furthermore, some of these algorithms opt for memory-intensive solvers by default. In some cases, you can avoid these limitations using the ideas that follow.
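To make the in-memory limitation concrete, the following is a minimal sketch of one possible workaround: feeding an estimator one chunk at a time through <code>partial_fit</code>, so that only a single batch is held in memory at once. The generator <code>iter_batches</code>, the synthetic weight vector <code>TRUE_W</code>, and all sizes are hypothetical stand-ins for reading chunks of a real dataset from disk. They are assumptions made for illustration, not part of scikit-learn.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Hypothetical ground-truth weights used only to generate synthetic labels.
TRUE_W = np.linspace(-1.0, 1.0, 10)


def iter_batches(n_batches, batch_size, seed):
    """Yield (X, y) chunks; a stand-in for reading pieces of a large file."""
    rng = np.random.default_rng(seed)
    for _ in range(n_batches):
        X = rng.normal(size=(batch_size, TRUE_W.size))
        y = (X @ TRUE_W > 0).astype(int)
        yield X, y


clf = SGDClassifier(random_state=0)  # linear model trained with SGD
classes = np.array([0, 1])           # all labels must be declared up front

# Only one batch is in memory at any time.
for X_batch, y_batch in iter_batches(n_batches=50, batch_size=200, seed=0):
    clf.partial_fit(X_batch, y_batch, classes=classes)

# Evaluate on a separately generated batch.
X_test, y_test = next(iter_batches(n_batches=1, batch_size=1000, seed=1))
accuracy = clf.score(X_test, y_test)
```

Because <code>partial_fit</code> never sees the whole dataset, the full list of class labels has to be passed on the first call. In practice the batches would come from a file reader or database cursor rather than a random generator.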


==Stochastic gradient solvers== <!--T:5-->