Have you ever wondered why we need to study math, calculus, or even algebra to analyze data? We try to escape the subject, but we can't outrun it. It follows us at every step. In this article, we explore why we need math in analytics.
Linear algebra is a key ingredient in almost anything related to data. Nearly every machine learning model relies on it in some form. In data science, data analytics, and business analytics, linear algebra is, alongside statistics, a foundational subject for understanding machine learning models, word embeddings, and data representation.
Moreover, matrices and vectors are essential building blocks of data science. Word embeddings depend on vectors, whereas matrices are used in movie recommendation systems and machine learning models.
Is this getting too technical? Let us start simply by understanding what word embeddings are.
How do you think a machine understands a human? There has to be some way of communicating. One goal of machine learning is to let a machine work with human words and the connections between them.
For example, how does a machine know that tomato ketchup is in fact just ketchup and not the name of a cloud? This is where “word embedding” comes into the picture. Word embeddings are typically learned by neural networks, which turn each word into a vector of numbers. The computer then works with these vectors and the relationships between them.
For example, the vectors for mom and dad would lie close together, whereas the vectors for mom and ketchup would not. This is how a computer represents the meaning of words, and it is the foundation for machine learning on text.
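To make "close" concrete, here is a minimal sketch using cosine similarity, one common way to compare word vectors. The three-dimensional vectors below are made up for illustration; real embeddings are learned from text and usually have hundreds of dimensions.

```python
import numpy as np

# Hypothetical toy embeddings, invented for this example only.
embeddings = {
    "mom": np.array([0.9, 0.8, 0.1]),
    "dad": np.array([0.8, 0.9, 0.1]),
    "ketchup": np.array([0.1, 0.0, 0.9]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

mom_dad = cosine_similarity(embeddings["mom"], embeddings["dad"])
mom_ketchup = cosine_similarity(embeddings["mom"], embeddings["ketchup"])

print(f"mom vs dad:     {mom_dad:.3f}")
print(f"mom vs ketchup: {mom_ketchup:.3f}")
```

With these toy numbers, the mom/dad score comes out close to 1 and the mom/ketchup score close to 0, which is the kind of gap a trained embedding would be expected to show.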
I mentioned matrices and movie recommendation systems earlier, so let me explain how that works.
When you watch a certain movie on Netflix, the recommendation system shows you movies or series closer to your taste. How does that happen? In a simplified picture, the genres present in each movie are encoded as a row of numbers, and the rows for all movies together form a matrix. The system compares the row for the movie you watched with the rows for other titles and puts the most similar ones on your home page. By the same logic, a title that shares no genres with anything you have watched is unlikely to show up in your recommendations.
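The idea can be sketched in a few lines. This is a toy illustration, not how Netflix actually works: the titles, genres, and the `recommend` helper are all hypothetical, and the ranking simply uses cosine similarity between rows of a genre matrix.

```python
import numpy as np

# Hypothetical catalogue: one row per movie, one column per genre.
genres = ["action", "comedy", "romance", "sci-fi"]
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
movie_genres = np.array([
    [1, 0, 0, 1],  # Movie A: action, sci-fi
    [1, 0, 0, 0],  # Movie B: action
    [0, 1, 1, 0],  # Movie C: comedy, romance
    [0, 0, 1, 1],  # Movie D: romance, sci-fi
], dtype=float)

def recommend(watched_index, top_n=2):
    """Rank the other movies by cosine similarity to the watched one."""
    norms = np.linalg.norm(movie_genres, axis=1)
    watched = movie_genres[watched_index]
    scores = movie_genres @ watched / (norms * norms[watched_index])
    scores[watched_index] = -np.inf  # never recommend the movie just watched
    ranked = np.argsort(scores)[::-1][:top_n]
    return [titles[i] for i in ranked]

recommendations = recommend(0)
print(recommendations)
```

After watching Movie A (action, sci-fi), the two titles that share a genre with it rank first, while Movie C, which shares none, gets a similarity of zero.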
In addition, face recognition models and prediction models report their results with a certain level of confidence, usually expressed as a percentage or probability. That number is also calculated with math. Every model, every piece of code, and even every data visualization has some kind of math behind it.
When we focus on data representation, have you ever wondered how the charts and summaries are made? EDA (Exploratory Data Analysis) is a huge part of today’s analytical world. From business analysis to economic analysis, data representation has become a key factor.
For example, the covariance matrix describes how features vary together, so it provides information about the relationships between features. Rescaling it by the standard deviations gives the correlation matrix, and one common graphical representation of a correlation matrix is a heatmap. Each entry of a covariance matrix is computed with a formula that is essentially a dot product between two mean-centered features, divided by the number of observations minus one.
So how does data representation actually work?
Covariance shows the relationship between variables. A positive covariance (and therefore a positive correlation) means that the two variables tend to move in the same direction, whereas a negative one means that they tend to move in opposite directions.
For example, a scatter plot is a visual representation that reveals the correlation between two variables. The more tightly the points cluster around a straight line, the stronger the correlation; a shapeless cloud of points indicates little or no correlation.
Even though covariance belongs to statistics, it is also a part of linear algebra: the covariance matrix is computed by multiplying the transpose of the mean-centered data matrix by the matrix itself and then dividing by the number of observations minus one.
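Here is a minimal sketch of that computation with NumPy. The small data set is invented for illustration (rows are observations, columns are features), and the result is checked against NumPy's own `np.cov`.

```python
import numpy as np

# Made-up data: 4 observations (rows) of 3 features (columns).
X = np.array([
    [1.0, 2.0, 9.0],
    [2.0, 4.1, 7.0],
    [3.0, 6.2, 5.5],
    [4.0, 7.9, 3.0],
])

n = X.shape[0]
X_centered = X - X.mean(axis=0)  # subtract each column's mean

# Sample covariance: transpose times the matrix itself, divided by n - 1.
cov_manual = X_centered.T @ X_centered / (n - 1)
cov_numpy = np.cov(X, rowvar=False)  # rowvar=False: columns are the variables

# Rescale by the standard deviations to get the correlation matrix.
std = np.sqrt(np.diag(cov_manual))
corr = cov_manual / np.outer(std, std)

print(np.allclose(cov_manual, cov_numpy))
print(np.round(corr, 3))
```

In this toy data the first two features rise together (correlation near +1), while the third falls as the first rises (correlation near -1). A heatmap is simply a colored rendering of the `corr` matrix.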
Exploring beyond data analysis:
Object detection, just like prediction models, also uses linear algebra. Object detection is a part of computer vision, which is the study of extracting information from images.
To process an image, the pixels are the core material to look at. Techniques like rotation, noise reduction, and compression are performed on the image, which is represented as pixels stored in matrices, vectors, and tensors.
In addition, the transpose of a matrix swaps its rows and columns. When a function such as ‘rotate’ or ‘flip’ is applied to an image, the pixels are rearranged in a similar way: a 90-degree rotation, for instance, is a transpose followed by a flip.
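A short NumPy sketch makes the link between transposing, flipping, and rotating visible. The 2x3 array stands in for a tiny grayscale image; the values are arbitrary.

```python
import numpy as np

# A tiny stand-in "image": 2 rows by 3 columns of pixel values.
image = np.array([
    [1, 2, 3],
    [4, 5, 6],
])

transposed = image.T               # rows become columns: shape (3, 2)
flipped = np.fliplr(image)         # mirror left to right
rotated_ccw = np.rot90(image)      # 90 degrees counterclockwise
rotated_cw = np.rot90(image, k=-1) # 90 degrees clockwise

# A rotation is a transpose combined with a flip.
ccw_via_transpose = np.flipud(image.T)
cw_via_transpose = np.fliplr(image.T)

print(rotated_ccw)
print(np.array_equal(rotated_ccw, ccw_via_transpose))
print(np.array_equal(rotated_cw, cw_via_transpose))
```

Both comparisons hold, so for a single-channel image a quarter turn really is just a transpose plus a mirror.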
Other examples include social media applications such as Snapchat and Instagram, or any filter-based application that offers filters with various levels of contrast and brightness. All of these functions manipulate pixel values, which means they operate directly on the matrix behind the image.
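As a rough sketch of what such a filter does, brightness can be modeled as adding a constant to every pixel and contrast as multiplying every pixel by a factor. The `adjust` helper below is hypothetical and deliberately simple; real apps use more elaborate filters.

```python
import numpy as np

# A tiny stand-in grayscale image with 8-bit pixel values (0 to 255).
image = np.array([[50, 100], [150, 200]], dtype=np.uint8)

def adjust(image, contrast=1.0, brightness=0):
    """Scale pixels by `contrast`, shift by `brightness`, keep within 0..255."""
    adjusted = image.astype(np.float64) * contrast + brightness
    return np.clip(adjusted, 0, 255).astype(np.uint8)

brighter = adjust(image, brightness=40)
higher_contrast = adjust(image, contrast=1.5)

print(brighter)
print(higher_contrast)
```

The conversion to floating point before the arithmetic, and the clipping afterwards, keep values from wrapping around when they exceed the 8-bit range.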
To conclude, we have often wondered why we learn so many mathematical operations in the first place when we can conveniently build models with machine learning libraries or other coding platforms. However, when we learn the math behind these models, we understand what they are actually doing. We can figure out why a certain plot or chart is used, or what exactly is happening inside a model that, say, predicts which cloud is cumulus and which is cirrus.