Cover
Authors
Topics
Status
To read
Book series
None
Average Rating
My Rating
⭐️⭐️⭐️⭐️⭐️
My Review
Number of Pages
510
Private Notes
Publisher
O’Reilly
Read Count
Series #
Year Published
2019
Finish Date
Parent item
Progress
Read pages
65
Sub-item
Total pages
510
Chapter 1
Types of Machine Learning Systems
Machine Learning systems can be classified in several ways: by the amount and type of supervision they get during training, by whether they can learn incrementally from a stream of incoming data, and by how they generalize. Classified by supervision, they usually fall into the following categories:
- Supervised learning
- Unsupervised learning
- Self-supervised learning
- Semi-supervised learning
- Reinforcement learning
- If the data comes with the expected labels for the machine to learn from, we call it supervised learning. If the label is categorical, the task is termed classification; if it is numeric, it is regression.
- Unlike supervised ML, where the model is exposed to the expected outputs it should learn to produce, unsupervised ML has no data labels to learn from; it must infer the inherent relationships within the data to form a model for prediction (both approaches are contrasted in the sketch after this list).
- Semi-supervised learning combines features of the two above to produce a model in which some of the data is labeled and some is not.
- Another approach to Machine Learning involves generating a fully labeled dataset from a fully unlabeled one; once the whole dataset is labeled, any supervised learning algorithm can be used. This approach is called self-supervised learning.
- Reinforcement learning: an agent learns from its own actions, receiving a reward when it acts correctly and a penalty otherwise. The agent discovers the right behavior by itself and encodes what it has learned in a strategy called a policy, which it updates over time.
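A minimal sketch of the supervised/unsupervised contrast, assuming scikit-learn and a synthetic dataset; the data and model choices here are illustrative, not from the book:

```python
# The same toy data handled both ways.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(n_samples=300, centers=3, random_state=42)

# Supervised: the model learns from the provided labels y.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:5]))

# Unsupervised: no labels are given; the model infers structure
# (here, three clusters) from the inherent relationships in X.
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print(km.labels_[:5])
```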
Batch versus Online Learning
Another way to classify Machine Learning systems is by whether they use batch or online learning.
- Batch learning: the machine is trained on all available data and then deployed. The process runs offline, with no injection of new data after training. This can work well if the dataset covers a well-established domain that does not change often. A dog-classification dataset may be stable in this way, but using an offline-trained model to predict the stock market would likely be inaccurate: events there move fast, and a dataset from one hour ago may already be stale for predicting the next few hours because the context changes rapidly.
- Online learning: the system is fed data incrementally, learning from mini-batches rather than one large batch. Unlike batch learning, each learning step is fast and cheap. This is most useful for data that changes rapidly, and is also apt when resources are limited. It suits huge datasets that cannot fit into memory at once, because it allows incremental learning one mini-batch at a time (see the sketch after this list). One important parameter of online learning systems is how fast they adapt to changing data, which is governed by the learning rate. A high learning rate makes the system learn quickly from new data, but it also forgets the old data quickly, since it focuses on the newly ingested information. Conversely, a low learning rate gives the system more inertia: it learns more slowly but is less sensitive to new data and to noise in it.
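A minimal sketch of online learning, assuming scikit-learn's SGDClassifier and a synthetic stream of mini-batches (in practice each batch would arrive from disk or a live feed):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=10_000, random_state=42)

# eta0 is the (constant) learning rate: higher adapts faster to new
# data but forgets old patterns more quickly; lower has more inertia.
clf = SGDClassifier(learning_rate="constant", eta0=0.01, random_state=42)

batch_size = 500
classes = np.unique(y)  # partial_fit must see all classes up front
for start in range(0, len(X), batch_size):
    X_batch = X[start:start + batch_size]
    y_batch = y[start:start + batch_size]
    clf.partial_fit(X_batch, y_batch, classes=classes)  # one cheap step

print(clf.score(X, y))
```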
Instance-Based Versus Model-Based Learning
The purpose of Machine Learning is to make predictions on unseen data, having learned from seen data. This ability to generalize is what makes Machine Learning such an ingenious tool. Generalization can be either instance-based or model-based.
- Instance-Based Learning: This approach identifies patterns by memorizing training examples and making predictions based on similarity comparisons. When new data is encountered, the model finds the closest matching examples from its training set and uses them to make predictions. For example, in spam detection, if certain words frequently appeared in spam emails during training, the model will flag emails containing similar words. This method works by recognizing similarities between new instances and previously learned examples rather than extracting general rules (see the first sketch after this list).
- Model-Based Learning: the model builds a formula that represents the relationship within the dataset, based on what it infers from the data. From that relationship it derives parameters, such as an intercept and a coefficient if the relationship is linear, or polynomial terms if it is curvilinear (see the second sketch after this list).
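A minimal sketch of instance-based learning with k-nearest neighbors, assuming scikit-learn and its built-in iris dataset (chosen purely for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# "Training" here essentially memorizes the instances; prediction
# compares each new point to its 3 nearest stored examples.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
print(knn.score(X_test, y_test))
```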
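And a minimal sketch of model-based learning, assuming a synthetic linear relationship; the fitted coefficient and intercept are the learned formula:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X.ravel() + 2.0 + rng.normal(scale=1.0, size=100)  # y ~ 3x + 2

model = LinearRegression()
model.fit(X, y)

# The learned parameters define the prediction formula
# y_hat = coef * x + intercept.
print(model.coef_[0], model.intercept_)
print(model.predict([[5.0]]))
```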
Additional Notes
- Hyperparameter: a parameter of the learning algorithm itself rather than of the model; it is set before training and stays constant during it. The amount of regularization to apply is a typical example.
- Usually, it is difficult to ascertain the effectiveness of a model until it is used in production. To estimate this effectiveness beforehand, it is good practice to divide the available data into two sets: training data and test data. The model is trained on the training data and evaluated on the test data. The error rate on the test data is termed the generalization error (or out-of-sample error), and it tells us how well the model will perform on unseen data. If the generalization error is low, the model is working correctly and can be expected to perform much the same at deployment. Conversely, if the rate is high, the model is either overfitting or underfitting.
- Suppose you find the best hyperparameter value that produces a model with the lowest generalization error, say just 5% error. You launch this model into production, but unfortunately it does not perform as well as expected and produces 15% errors. What just happened?
- The problem is that you measured the generalization error on the test set, and you adapted the model and hyperparameters to produce the best model for that particular set.
- This means that the model is unlikely to perform as well on new data.
- A common solution to this problem is called holdout validation: you simply hold out part of the training set, use it to evaluate several candidate models, and select the one that performs best on this validation set (a sketch follows this list).
- After this holdout validation process, you train the best model on the full training set (including the validation set), and this gives you the final model.
- Lastly, you evaluate this final model on the test set to get an estimate of the generalization error.
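A minimal sketch of holdout validation as described above, assuming scikit-learn; the two candidate models and their hyperparameters are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out a test set first, then carve a validation set out of
# the remaining training data.
X_train_full, X_test, y_train_full, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_train_full, y_train_full, test_size=0.25, random_state=42)

candidates = [
    LogisticRegression(max_iter=1000),
    DecisionTreeClassifier(max_depth=3, random_state=42),
]

# Select the candidate that performs best on the validation set.
best = max(candidates,
           key=lambda m: m.fit(X_train, y_train).score(X_val, y_val))

# Retrain the winner on the full training set (including the
# validation data), then estimate the generalization error once
# on the untouched test set.
best.fit(X_train_full, y_train_full)
print(best.score(X_test, y_test))
```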