Large Scale Machine Learning (Big Data)

<!--T:16-->
Another option that reduces memory usage even more is to use [https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDRegressor.html SGDRegressor] instead of <code>Ridge</code>. This class implements many types of generalized linear models for regression, using vanilla stochastic gradient descent as a solver. One caveat of using <code>SGDRegressor</code> is that it only works if the output is one-dimensional (a scalar).
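
What follows is a minimal sketch of this swap, not an example from this page: the synthetic data and the hyperparameter values are illustrative assumptions.

<syntaxhighlight lang="python">
import numpy as np
from sklearn.linear_model import SGDRegressor

# Illustrative synthetic data; in practice X and y come from your own dataset.
rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 50)).astype(np.float32)
y = X @ rng.standard_normal(50)  # y must be one-dimensional: one scalar per row

# With the default squared-error loss and penalty="l2", SGDRegressor minimizes
# a Ridge-style objective, but optimizes it with stochastic gradient descent
# instead of a direct solver, avoiding large intermediate matrices in memory.
model = SGDRegressor(loss="squared_error", penalty="l2", alpha=1e-4)
model.fit(X, y)
print(model.score(X, y))
</syntaxhighlight>

Since stochastic gradient descent is sensitive to feature scales, it is usually worth standardizing the inputs (for example with <code>StandardScaler</code>) before fitting.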


<!--T:17-->


==Batch learning== <!--T:21-->


<!--T:22-->
In cases where your dataset is too large to fit in memory, or just large enough that it does not leave enough memory free for training, it is possible to leave your data on disk and load it in batches during training, similar to how deep learning packages work. Scikit-learn refers to this as [https://scikit-learn.org/stable/computing/scaling_strategies.html <i>out-of-core learning</i>] and it is a viable option whenever an estimator has the <code>partial_fit</code> [https://scikit-learn.org/stable/computing/scaling_strategies.html?highlight=partial_fit#incremental-learning method available]. In the examples below, we perform out-of-core learning by iterating over datasets stored on disk.
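
As a hedged sketch of what such a loop can look like (the <code>batch_*.npy</code> file names, the number of files, and the assumption that the target is the last column are hypothetical placeholders rather than this page's actual examples):

<syntaxhighlight lang="python">
import numpy as np
from sklearn.linear_model import SGDRegressor

model = SGDRegressor(loss="squared_error", penalty="l2", alpha=1e-4)

n_batches = 10          # hypothetical number of batch files on disk
for epoch in range(5):  # partial_fit does one pass, so loop over several epochs
    for i in range(n_batches):
        batch = np.load(f"batch_{i}.npy")   # load one batch at a time
        X, y = batch[:, :-1], batch[:, -1]  # assume the target is the last column
        # Update the model with just this batch; only one batch ever
        # resides in memory at a time.
        model.partial_fit(X, y)
</syntaxhighlight>

Each call to <code>partial_fit</code> performs a single pass over the given batch, so the outer epoch loop compensates for the fact that <code>partial_fit</code> does not iterate on its own.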


<!--T:23-->