MIT Press has made the Deep Learning textbook available for free download. The authors, Goodfellow, Bengio and Courville, devote a chapter to reviewing linear algebra, which is vital to understanding deep learning. This is my one-pager on what one needs to know.
A.I. loves data! Mounds of it. Data, data, data, … except it is better to have many rows of data and not so many columns. The curse of dimensionality means that the more features \(x_0, x_1, …, x_n\) your dataset has, the sparser your rows become in feature space and the harder it is for any model to learn from them.
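To make the curse concrete: as columns are added with the row count held fixed, pairwise distances concentrate, so a point’s nearest and farthest neighbours become almost indistinguishable. A minimal NumPy sketch of that effect (the 100-row sample, the uniform data and the spread statistic are my own illustrative choices, not from the textbook):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hold the row count at 100 and grow the number of feature columns.
# The relative spread between the nearest and farthest neighbour of
# a point collapses toward zero as dimensionality climbs.
for d in (2, 10, 100, 1000):
    X = rng.random((100, d))                      # 100 rows, d columns
    dists = np.linalg.norm(X[1:] - X[0], axis=1)  # distances from row 0
    spread = (dists.max() - dists.min()) / dists.min()
    print(f"{d:>4} features: relative distance spread = {spread:.2f}")
```

Run it and the spread shrinks with every jump in dimensionality, which is why distance-based methods struggle on wide, short datasets.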
It is often better to keep development environments separate from a workstation’s environment. Language versions, Kubernetes configuration files and other artifacts can get pretty borked up. Developing in a container environment keeps the host workstation clean.
I’m heavily indebted to the Pluralsight Go courses and to Jon Bodner’s Learning Go: An Idiomatic Approach to Real-World Go Programming. This is not a complete cheatsheet; for example, I won’t mention small stuff like unused variables causing a compile-time error. The full Go language specification is located at https://go.dev/ref/spec.
Coding for the Internet in 2023 means microservice architecture, which means containers and container orchestration. To assist with best practices, Adam Wiggins of Heroku published his team’s developer guidelines, called The Twelve-Factor App. According to Wikipedia, Nginx extended the principles and O’Reilly added their own two cents. However, I consider the original principles a solid beginning.
Classification is the supervised-learning task of deciding whether something belongs to one group or another. For example, is the email I received five minutes ago a phishing attempt, or is it legit? Is the family at 123 Dunno Court, Ste-Clotilde-de-Rubber-Boot more likely to vote Bloc, Liberal or PC?
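As a toy illustration of the phishing case, here is a minimal scikit-learn sketch. The two features, the labelling rule and every number in it are hypothetical, invented only to show the supervised fit-then-predict cycle:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical email features: [number of links, number of misspellings].
X = rng.poisson(lam=(3, 2), size=(500, 2)).astype(float)
# Hypothetical ground truth: many links plus misspellings => phishing (1).
y = (X[:, 0] + X[:, 1] + rng.normal(0, 1, 500) > 6).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
print("phishing?", clf.predict([[12.0, 7.0]]))  # a link-stuffed email
```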
The problem with OLS is that it can fit the training data too well. The test data may then show a pretty awful \(R^2\), which screams overfitting. This post, taken from the MMAI 863 course at Smith, details some ways to bring test-set performance closer to the training-set fit.
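One of those ways is regularization: shrink the coefficients so the model gives up a little training-set fit in exchange for generalization. A minimal sketch comparing OLS to ridge regression with scikit-learn (the synthetic wide dataset and the alpha value are my own assumptions, not taken from the course):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# A wide-ish dataset: 60 rows, 40 mostly useless columns, an OLS trap.
X = rng.normal(size=(60, 40))
y = 3.0 * X[:, 0] + rng.normal(scale=2.0, size=60)  # one real signal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

for name, model in [("OLS", LinearRegression()), ("Ridge", Ridge(alpha=10.0))]:
    model.fit(X_tr, y_tr)
    print(f"{name:>5}: train R^2 = {model.score(X_tr, y_tr):.2f}, "
          f"test R^2 = {model.score(X_te, y_te):.2f}")
```

OLS posts a near-perfect training \(R^2\) and a dismal test \(R^2\); ridge trades away some of the former to rescue the latter.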