Sunday, 7 July 2024

Learning - Linear Algebra

Asked Gemini and ChatGPT some queries and noted the answers here. :)


Linear Algebra 

- is the branch of mathematics that deals with vector spaces and linear transformations.

- what is a vector space? A vector space is a collection of vectors on which specific operations, like addition and scalar multiplication (multiplying a vector by a number), can be performed.


- what is a vector? Vectors represent quantities that have both size and direction. An example is force, which has both a strength and a direction.

  • Vectors can be added together and multiplied by scalars.
  • Vector addition: \mathbf{a} + \mathbf{b} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} a_1 + b_1 \\ a_2 + b_2 \end{pmatrix}
  • Scalar multiplication: c \mathbf{v} = c \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} c v_1 \\ c v_2 \end{pmatrix}
  • Represented graphically by arrows in space, with the direction and length corresponding to the vector's direction and magnitude.
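
  A quick NumPy sketch of the two operations above (the array values are made up for illustration):

    import numpy as np

    a = np.array([1.0, 2.0])
    b = np.array([3.0, -1.0])

    print(a + b)   # element-wise addition: [4. 1.]
    print(3 * a)   # scalar multiplication: [3. 6.]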

  - what are linear transformations? Functions that map one vector space to another while preserving the linear relationships between vectors. A linear transformation stretches, shrinks, or rotates vectors. Matrices represent linear transformations.


    - what are matrices? Matrices are rectangular arrays of numbers used to represent linear transformations, solve systems of linear equations, and store data. They are grid-like structures on which row and column operations can be performed to manipulate vectors.

    A matrix with m rows and n columns is called an m \times n matrix (read as "m by n matrix"). It is typically written in the form:

    A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}

    where a_{ij} represents the element in the i-th row and j-th column of the matrix.
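
    For example, a 2 x 3 matrix in NumPy (note that NumPy indices are 0-based, so the element a_{11} of the notation above is A[0, 0] in code):

      import numpy as np

      A = np.array([[1, 2, 3],
                    [4, 5, 6]])   # m = 2 rows, n = 3 columns

      print(A.shape)    # (2, 3)
      print(A[0, 0])    # element in the 1st row, 1st column -> 1
      print(A[1, 2])    # element in the 2nd row, 3rd column -> 6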


    Operations involving Vectors and Scalars

    • Dot Product: The dot product (or scalar product) of two vectors \mathbf{a} and \mathbf{b} is a scalar defined as: \mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + \ldots + a_n b_n
    • Cross Product: The cross product of two vectors in \mathbb{R}^3 results in another vector perpendicular to both: \mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}
    • Norm (Magnitude): The norm (or length) of a vector \mathbf{v} is given by: \|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + \ldots + v_n^2}
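
    A small NumPy sketch of these three operations (illustrative values only):

      import numpy as np

      a = np.array([1.0, 2.0, 3.0])
      b = np.array([4.0, 5.0, 6.0])

      print(np.dot(a, b))        # dot product: 1*4 + 2*5 + 3*6 = 32.0
      print(np.cross(a, b))      # cross product: [-3.  6. -3.], perpendicular to a and b
      print(np.linalg.norm(a))   # norm: sqrt(1 + 4 + 9) ≈ 3.742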

    Matrix Operations

    1. Addition: Two matrices of the same dimension can be added by adding their corresponding elements.

      A + B = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn} \end{pmatrix}
    2. Scalar Multiplication: A matrix can be multiplied by a scalar by multiplying each element of the matrix by the scalar.

      cA = \begin{pmatrix} c a_{11} & c a_{12} & \cdots & c a_{1n} \\ c a_{21} & c a_{22} & \cdots & c a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c a_{m1} & c a_{m2} & \cdots & c a_{mn} \end{pmatrix}
    3. Matrix Multiplication: Two matrices A (of dimension m \times n) and B (of dimension n \times p) can be multiplied to form a matrix C = AB (of dimension m \times p).

      c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}

      Each element c_{ij} of the resulting matrix C is the dot product of the i-th row of A and the j-th column of B. (A NumPy sketch of these matrix operations follows below.)

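    A NumPy sketch of the three matrix operations above, using small illustrative matrices:

      import numpy as np

      A = np.array([[1, 2],
                    [3, 4]])   # 2 x 2
      B = np.array([[5, 6],
                    [7, 8]])   # 2 x 2

      print(A + B)   # element-wise addition
      print(2 * A)   # scalar multiplication
      print(A @ B)   # matrix product: c_ij = sum over k of a_ik * b_kj
                     # [[19 22]
                     #  [43 50]]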



    Linear Algebra in Machine Learning

    1. Data Representation

    • Vectors and Matrices: Data is often represented as vectors (1D arrays) or matrices (2D arrays). For instance, a dataset with m samples and n features is represented as an m \times n matrix.
    • Tensors: Higher-dimensional arrays, known as tensors, are used for more complex data structures, such as images (3D tensors) or videos (4D tensors).
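
    For instance, typical array shapes in NumPy (all values random, purely for illustration):

      import numpy as np

      X = np.random.rand(100, 5)             # dataset: 100 samples x 5 features
      image = np.random.rand(64, 64, 3)      # one RGB image as a 3D tensor (height x width x channels)
      batch = np.random.rand(32, 64, 64, 3)  # a batch of 32 images as a 4D tensor

      print(X.shape, image.shape, batch.shape)   # (100, 5) (64, 64, 3) (32, 64, 64, 3)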

    2. Model Representation

    • Linear Models: Linear regression, logistic regression, and support vector machines use vectors and matrices to represent coefficients and features: \mathbf{y} = X\mathbf{w} + \mathbf{b}. Here, \mathbf{y} is the vector of predictions, X is the matrix of input features, \mathbf{w} is the weight vector, and \mathbf{b} is the bias vector.
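
    A minimal sketch of this prediction step in NumPy; the feature matrix, weights, and bias below are arbitrary placeholder values, not fitted parameters:

      import numpy as np

      X = np.array([[1.0, 2.0],
                    [3.0, 4.0],
                    [5.0, 6.0]])   # 3 samples, 2 features
      w = np.array([0.5, -1.0])    # weight vector (assumed)
      b = 2.0                      # bias (assumed)

      y = X @ w + b                # one prediction per sample
      print(y)                     # [ 0.5 -0.5 -1.5]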

    3. Transformations and Projections

    • Linear Transformations: Matrices are used to perform linear transformations, such as scaling, rotating, and translating data points in space. These transformations are essential in neural networks and dimensionality reduction techniques.
    • Principal Component Analysis (PCA): PCA is a technique to reduce the dimensionality of data while preserving as much variance as possible. It involves eigenvalue decomposition of the covariance matrix: X^T X = V \Lambda V^T, where V is the matrix of eigenvectors and \Lambda is the diagonal matrix of eigenvalues.
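
    A rough sketch of PCA on random data: centre the features, eigendecompose the covariance matrix, and project onto the top-2 eigenvectors (the dataset size and number of kept components are arbitrary choices):

      import numpy as np

      X = np.random.rand(200, 5)               # 200 samples, 5 features
      X_centered = X - X.mean(axis=0)          # centre each feature

      cov = np.cov(X_centered, rowvar=False)   # 5 x 5 covariance matrix
      eigvals, eigvecs = np.linalg.eigh(cov)   # eigh is meant for symmetric matrices

      order = np.argsort(eigvals)[::-1]        # sort eigenvalues in decreasing order
      V = eigvecs[:, order[:2]]                # top-2 principal directions

      X_reduced = X_centered @ V               # project the data to 2 dimensions
      print(X_reduced.shape)                   # (200, 2)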

    4. Optimization

    • Gradient Descent: Optimization algorithms like gradient descent rely on linear algebra to update model parameters. The gradient of the loss function with respect to the parameters is computed using vector and matrix operations: \theta := \theta - \alpha \nabla J(\theta), where \theta represents the parameters, \alpha is the learning rate, and \nabla J(\theta) is the gradient of the loss function.
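
    A toy gradient-descent loop for linear regression with a mean-squared-error loss; the synthetic data, learning rate, and iteration count are arbitrary choices for illustration:

      import numpy as np

      rng = np.random.default_rng(0)
      X = rng.normal(size=(100, 2))                      # 100 samples, 2 features
      true_w = np.array([2.0, -3.0])
      y = X @ true_w + rng.normal(scale=0.1, size=100)   # noisy targets

      theta = np.zeros(2)                                # parameters to learn
      alpha = 0.1                                        # learning rate

      for _ in range(200):
          grad = (2 / len(y)) * X.T @ (X @ theta - y)    # gradient of the MSE loss
          theta = theta - alpha * grad                   # theta := theta - alpha * grad J(theta)

      print(theta)                                       # should end up close to [2, -3]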

    5. Neural Networks

    • Forward Propagation: In neural networks, inputs are transformed through multiple layers using matrix multiplications and non-linear activation functions: \mathbf{a}^{(l+1)} = \sigma(W^{(l)} \mathbf{a}^{(l)} + \mathbf{b}^{(l)}), where \mathbf{a}^{(l)} is the activation vector of layer l, W^{(l)} is the weight matrix, \mathbf{b}^{(l)} is the bias vector, and \sigma is the activation function (see the sketch after this list).
    • Backpropagation: The backpropagation algorithm for training neural networks involves computing gradients of the loss function with respect to each parameter, which relies heavily on matrix calculus.
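
    A minimal forward pass through one hidden layer, following the forward-propagation formula above; the layer sizes and random weights are assumptions made just to show the shapes of the computation:

      import numpy as np

      def sigma(z):
          return 1.0 / (1.0 + np.exp(-z))   # sigmoid activation

      rng = np.random.default_rng(0)
      a0 = rng.normal(size=3)               # input activations (3 features)

      W1 = rng.normal(size=(4, 3))          # weights: 3 inputs -> 4 hidden units
      b1 = np.zeros(4)
      W2 = rng.normal(size=(1, 4))          # weights: 4 hidden units -> 1 output
      b2 = np.zeros(1)

      a1 = sigma(W1 @ a0 + b1)              # hidden-layer activations
      a2 = sigma(W2 @ a1 + b2)              # network output
      print(a1.shape, a2.shape)             # (4,) (1,)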

    6. Singular Value Decomposition (SVD)

    • Matrix Factorization: Techniques like SVD are used in dimensionality reduction, noise reduction, and data compression. In recommendation systems, SVD helps in decomposing the user-item interaction matrix: A = U \Sigma V^T, where A is the original matrix, U and V are orthogonal matrices, and \Sigma is a diagonal matrix of singular values.
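
    A small sketch of a truncated SVD on a made-up "user-item" rating matrix; the ratings are invented purely to show the decomposition and a low-rank reconstruction:

      import numpy as np

      A = np.array([[5.0, 3.0, 0.0, 1.0],
                    [4.0, 0.0, 0.0, 1.0],
                    [1.0, 1.0, 0.0, 5.0],
                    [0.0, 1.0, 5.0, 4.0]])

      U, s, Vt = np.linalg.svd(A, full_matrices=False)

      k = 2                                           # keep the 2 largest singular values
      A_approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]   # low-rank approximation of A

      print(np.round(s, 2))                           # singular values, decreasing order
      print(np.round(A_approx, 2))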

    7. Probabilistic Models

    • Multivariate Gaussian Distribution: The covariance matrix of a multivariate Gaussian distribution is a fundamental concept in probabilistic models and Bayesian inference: p(\mathbf{x}) = \frac{1}{(2\pi)^{k/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (\mathbf{x} - \mu)^T \Sigma^{-1} (\mathbf{x} - \mu)\right), where \mu is the mean vector and \Sigma is the covariance matrix.
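
    Evaluating the density formula above directly with NumPy; the mean vector, covariance matrix, and query point are arbitrary illustrative values:

      import numpy as np

      mu = np.array([0.0, 0.0])
      Sigma = np.array([[1.0, 0.5],
                        [0.5, 2.0]])
      x = np.array([1.0, -1.0])

      k = len(mu)
      diff = x - mu
      density = (np.exp(-0.5 * diff @ np.linalg.inv(Sigma) @ diff)
                 / np.sqrt((2 * np.pi) ** k * np.linalg.det(Sigma)))
      print(density)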

    8. Clustering

    • K-Means Clustering: In K-means clustering, linear algebra is used to calculate distances between points and centroids, update centroids, and minimize the sum of squared distances within clusters.
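
    One K-means iteration written with array operations: compute squared distances from every point to every centroid, assign each point to its nearest centroid, and recompute centroids as cluster means (random 2D data and k = 3, chosen only for illustration):

      import numpy as np

      rng = np.random.default_rng(0)
      X = rng.normal(size=(300, 2))                              # 300 points in 2D
      centroids = X[rng.choice(len(X), size=3, replace=False)]   # initial centroids

      # squared Euclidean distances, shape (300, 3)
      dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
      labels = dists.argmin(axis=1)                              # nearest-centroid assignment

      # update each centroid as the mean of its assigned points
      centroids = np.array([X[labels == j].mean(axis=0) for j in range(3)])
      print(centroids)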

    Example Applications

    1. Image Processing: Images are represented as matrices of pixel values. Operations like convolution, used in convolutional neural networks (CNNs), rely on matrix multiplication.
    2. Natural Language Processing (NLP): Text data is often represented using embeddings, which are matrices that map words to vectors. Matrix operations are used in various NLP models, including transformers.
    3. Recommender Systems: Matrix factorization techniques, such as collaborative filtering, use linear algebra to predict user preferences based on historical data.
