Unit 2: Sparse Matrix
What is a sparse matrix?
A sparse matrix is a special case of a matrix in which the
number of zero elements is much higher than the number of
non-zero elements.
As a rule of thumb, if 2/3 of the total elements in a matrix are
zeros, it can be called a sparse matrix.
A sparse matrix representation stores only the non-zero values,
which significantly reduces both the space used for representing
the data and the time needed to scan the matrix.
Many applications in data science and machine learning involve
sparse matrices, such as:
Natural Language Processing: The occurrence of words in
documents can be represented in a sparse matrix. The words
in a document are only a small subset of the words in a language.
If we have a row for every document and a column for every
word, then storing the number of word occurrences in a
document gives a high percentage of zeros in every column.

Recommendation Systems: A sparse matrix can be
employed to represent whether any particular user has
watched any movie.

Market Basket Analysis: Since the number of purchased items
is tiny compared to the number of non-purchased items, a
sparse matrix is used to represent all products and customers.
Numerical example 1
Let's take the example of a movie recommendation system.
There are millions of users and thousands of movies, so it's not
possible for users to watch and rate all movies.
This data can be represented as a matrix where the rows are the
users, and the columns are the movies.
Most of the matrix elements will be empty, and the missing
values are replaced with zeros.
Since a small percentage of the matrix has non-zero values, this
matrix can be considered a sparse matrix. A small portion of the
data is given below:
         Movie 1  Movie 2  Movie 3  Movie 4  Movie 5  Movie 6  Movie 7
User 1      0        0        0        3        0        0        4
User 2      0        5        0        0        0        0        0
User 3      0        0        5        0        0        4        0
User 4      4        0        0        0        0        0        1
User 5      0        2        0        0        3        0        0
The sparsity of this matrix can be calculated by obtaining the ratio
of zero elements to total elements. For this example, 26 of the 35
elements are zero, so sparsity is calculated as:
Sparsity = 26 / 35 ≈ 0.74
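The ratio described above can be sketched in a few lines of C++. The helper names sparsity and ratingsExample are assumptions made for this illustration only; for the 5 x 7 ratings table the function should return 26/35, which is about 0.74.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: fraction of entries that are zero in a dense matrix.
double sparsity(const std::vector<std::vector<int>>& matrix)
{
    std::size_t zeros = 0;
    std::size_t total = 0;
    for (const auto& row : matrix)
    {
        for (int value : row)
        {
            ++total;
            if (value == 0)
                ++zeros;
        }
    }
    // An empty matrix has no elements, so report 0 instead of dividing by 0.
    return total == 0 ? 0.0 : static_cast<double>(zeros) / total;
}

// The 5 x 7 user/movie ratings from the table above (rows are users).
std::vector<std::vector<int>> ratingsExample()
{
    return {
        {0, 0, 0, 3, 0, 0, 4},
        {0, 5, 0, 0, 0, 0, 0},
        {0, 0, 5, 0, 0, 4, 0},
        {4, 0, 0, 0, 0, 0, 1},
        {0, 2, 0, 0, 3, 0, 0},
    };
}
```

Calling sparsity(ratingsExample()) scans all 35 elements once, so this check itself costs O(NM) time.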
Numerical example 2
Another example would be to use a matrix to represent the
occurrence of words in documents.
The term-document matrix dimension will be n×m, where n is
the number of documents and m is the number of words in the
language model.
Most of the matrix elements will be zero, because each document
contains only a small fraction of the words in the language model,
and only the non-zero values are important for data analysis.
Storing the full matrix uses a large amount of space. It also
costs computation time, because all elements must be scanned to
reach the non-zero elements.
To overcome these problems, we can use different data
structures to represent a sparse matrix. One common
representation format for a sparse matrix is the triple
(Row, Column, Value) form described below.
Why use a sparse matrix instead of a simple matrix?
Storage: There are fewer non-zero elements than zeros, so less
memory is needed when only the non-zero elements are stored.
Computing time: Computing time can be saved by designing a
data structure that traverses only the non-zero elements.
Example:
0 0 3 0 4
0 0 5 7 0
0 0 0 0 0
0 2 6 0 0
Representing a sparse matrix by a 2D array wastes a lot of memory,
because the zeros in the matrix are of no use in most cases. So,
instead of storing zeros along with the non-zero elements, we store
only the non-zero elements. Each non-zero element is stored as a
triple (Row, Column, Value).
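To see when the triple form actually saves memory, the sketch below counts how many integers each layout stores. The function names are assumptions for illustration, and the count ignores any header row and the size of an individual integer. A dense N x M array stores N*M integers, while the triple form stores 3*K integers for K non-zero elements, so the triple form is smaller only when K is less than N*M/3, that is, when more than 2/3 of the elements are zero.

```cpp
#include <cstddef>

// Number of integers stored by a dense rows x cols array.
std::size_t denseCells(std::size_t rows, std::size_t cols)
{
    return rows * cols;
}

// Number of integers stored by the triple form: one (row, column, value)
// triple per non-zero element.
std::size_t tripleCells(std::size_t nonZeros)
{
    return 3 * nonZeros;
}

// True when the triple form needs fewer integers than the dense array.
bool tripleFormIsSmaller(std::size_t rows, std::size_t cols, std::size_t nonZeros)
{
    return tripleCells(nonZeros) < denseCells(rows, cols);
}
```

For the 4 x 5 example matrix with 6 non-zero elements, the dense array stores 20 integers and the triple form stores 18, so the saving is small here; it grows quickly as the matrix gets larger and sparser.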
A sparse matrix can be represented in many ways. The following
are two common representations:
1. Array representation
2. Linked list representation
Method 1: Using Arrays
A 2D array with three rows is used to represent the sparse matrix.
The rows are named:
Row: Index of the row where the non-zero element is located
Column: Index of the column where the non-zero element is located
Value: Value of the non-zero element located at index (row, column)
// C++ program for Sparse Matrix Representation using Array
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    // Assume 4x5 sparse matrix
    int sparseMatrix[4][5] =
    {
        {0 , 0 , 3 , 0 , 4 },
        {0 , 0 , 5 , 7 , 0 },
        {0 , 0 , 0 , 0 , 0 },
        {0 , 2 , 6 , 0 , 0 }
    };

    // Count the non-zero elements
    int size = 0;
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 5; j++)
            if (sparseMatrix[i][j] != 0)
                size++;

    // The number of columns in compactMatrix (size) must be equal to
    // the number of non-zero elements in sparseMatrix. A vector is used
    // because an array whose size is only known at run time is not
    // standard C++.
    vector<vector<int>> compactMatrix(3, vector<int>(size));

    // Fill the compact matrix: row 0 = row index, row 1 = column index,
    // row 2 = value
    int k = 0;
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 5; j++)
            if (sparseMatrix[i][j] != 0)
            {
                compactMatrix[0][k] = i;
                compactMatrix[1][k] = j;
                compactMatrix[2][k] = sparseMatrix[i][j];
                k++;
            }

    // Print the three rows of the compact matrix
    for (int i = 0; i < 3; i++)
    {
        for (int j = 0; j < size; j++)
            cout << " " << compactMatrix[i][j];
        cout << "\n";
    }
    return 0;
}
Output
 0 0 1 1 3 3
 2 4 2 3 1 2
 3 4 5 7 2 6
Time Complexity: O(NM), where N is the number of rows in the sparse
matrix, and M is the number of columns in the sparse matrix.
Auxiliary Space: O(K), where K is the number of non-zero elements,
because the compact matrix stores three values per non-zero element.
In the worst case K = NM, which gives O(NM).
Method 2: Using Linked Lists
In the linked list representation, each node has four fields:
Row: Index of the row where the non-zero element is located
Column: Index of the column where the non-zero element is located
Value: Value of the non-zero element located at index (row, column)
Next node: Address of the next node
// C++ program for sparse matrix representation
// using a linked list
#include <iostream>
using namespace std;

// Node class to represent one non-zero element in the linked list
class Node
{
public:
    int row;
    int col;
    int data;
    Node *next;
};

// Function to create a new node and append it to the list
void create_new_node(Node **p, int row_index, int col_index, int x)
{
    // Build the new node; it always becomes the last node of the list
    Node *r = new Node();
    r->row = row_index;
    r->col = col_index;
    r->data = x;
    r->next = nullptr;

    // If the linked list is empty, the new node becomes the first node
    if (*p == nullptr)
    {
        *p = r;
        return;
    }

    // If the linked list already exists, walk to the last node and
    // append the newly created node
    Node *temp = *p;
    while (temp->next != nullptr)
        temp = temp->next;
    temp->next = r;
}

// Function prints contents of linked list starting from start
void printList(Node *start)
{
    Node *ptr = start;
    cout << "row_position:";
    while (ptr != nullptr)
    {
        cout << ptr->row << " ";
        ptr = ptr->next;
    }
    cout << endl;

    cout << "column_position:";
    ptr = start;
    while (ptr != nullptr)
    {
        cout << ptr->col << " ";
        ptr = ptr->next;
    }
    cout << endl;

    cout << "Value:";
    ptr = start;
    while (ptr != nullptr)
    {
        cout << ptr->data << " ";
        ptr = ptr->next;
    }
    cout << endl;
}

// Function releases the memory of every node in the list
void deleteList(Node *start)
{
    while (start != nullptr)
    {
        Node *next = start->next;
        delete start;
        start = next;
    }
}

// Driver Code
int main()
{
    // 4x5 sparse matrix
    int sparseMatrix[4][5] = { { 0 , 0 , 3 , 0 , 4 },
                               { 0 , 0 , 5 , 7 , 0 },
                               { 0 , 0 , 0 , 0 , 0 },
                               { 0 , 2 , 6 , 0 , 0 } };

    // Creating head/first node of list as null
    Node *first = nullptr;
    for (int i = 0; i < 4; i++)
    {
        for (int j = 0; j < 5; j++)
        {
            // Pass only those values which are non-zero
            if (sparseMatrix[i][j] != 0)
                create_new_node(&first, i, j, sparseMatrix[i][j]);
        }
    }
    printList(first);
    deleteList(first);
    return 0;
}
In the sparse tables below, row 0 is a header that stores the
number of rows, the number of columns, and the number of non-zero
values of the original matrix.

Matrix 1 (4 * 4 Matrix)
     0    1    2    3
0    2    0    1    0
1    0    4    0    2
2    3    0    0    0
3    0    0    5    0

Sparse Matrix 1 (7 * 3 Sparse Matrix)
     Row  Col  Val
0     4    4    6
1     0    0    2
2     0    2    1
3     1    1    4
4     1    3    2
5     2    0    3
6     3    2    5

Matrix 2 (4 * 4 Matrix)
     0    1    2    3
0    0    0    2    0
1    0    0    0    1
2    0    0    0    5
3    0    0    3    0

Sparse Matrix 2 (5 * 3 Sparse Matrix)
     Row  Col  Val
0     4    4    4
1     0    2    2
2     1    3    1
3     2    3    5
4     3    2    3
Addition of two Sparse Matrices:

Matrix 1
     0    1    2    3
0    2    0    1    0
1    0    4    0    2
2    3    0    0    0
3    0    0    5    0

Matrix 2
     0    1    2    3
0    0    0    2    0
1    0    0    0    1
2    0    0    0    5
3    0    0    3    0

Matrix 3 => Addition of Matrix 1 & 2
     0    1    2    3
0    2    0    3    0
1    0    4    0    3
2    3    0    0    5
3    0    0    8    0

Sparse Matrix 1
     Row  Col  Val
0     4    4    6
1     0    0    2
2     0    2    1
3     1    1    4
4     1    3    2
5     2    0    3
6     3    2    5

Sparse Matrix 2
     Row  Col  Val
0     4    4    4
1     0    2    2
2     1    3    1
3     2    3    5
4     3    2    3

Sparse Matrix Addition
     Row  Col  Val
0     4    4    7
1     0    0    2
2     0    2    3
3     1    1    4
4     1    3    3
5     2    0    3
6     2    3    5
7     3    2    8
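The addition above can be sketched as a merge of two sorted triple lists, much like merging two sorted arrays. This is only one possible approach, not necessarily the method intended by these notes. The names Triple, SparseMatrix and add are assumptions for illustration; the header row of the tables is kept in the rows and cols fields, and both inputs are assumed to have the same dimensions and to be sorted by row, then by column.

```cpp
#include <cstddef>
#include <vector>

// One non-zero element (hypothetical type for this sketch).
struct Triple { int row; int col; int val; };

// Dimensions plus the non-zero entries, sorted by row and then by column.
struct SparseMatrix
{
    int rows;
    int cols;
    std::vector<Triple> entries;
};

// Adds two sparse matrices of equal dimensions by merging their entries.
SparseMatrix add(const SparseMatrix& a, const SparseMatrix& b)
{
    SparseMatrix sum{a.rows, a.cols, {}};
    std::size_t i = 0, j = 0;
    while (i < a.entries.size() && j < b.entries.size())
    {
        const Triple& x = a.entries[i];
        const Triple& y = b.entries[j];
        if (x.row < y.row || (x.row == y.row && x.col < y.col))
        {
            sum.entries.push_back(x);   // position exists only in a
            ++i;
        }
        else if (y.row < x.row || (y.row == x.row && y.col < x.col))
        {
            sum.entries.push_back(y);   // position exists only in b
            ++j;
        }
        else
        {
            int v = x.val + y.val;      // same position in both: add values
            if (v != 0)                 // keep the result sparse
                sum.entries.push_back({x.row, x.col, v});
            ++i;
            ++j;
        }
    }
    // Copy whatever is left in either list.
    for (; i < a.entries.size(); ++i) sum.entries.push_back(a.entries[i]);
    for (; j < b.entries.size(); ++j) sum.entries.push_back(b.entries[j]);
    return sum;
}
```

Because each entry of either input is visited once, the merge takes time proportional to the total number of non-zero entries rather than to the full N x M size.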
Transpose of Sparse Matrix:
Example:

Matrix 2 (4 * 3 Matrix)
     0    1    2
0    2    0    1
1    0    4    0
2    3    0    0
3    0    0    5

Sparse Matrix 2
     Row  Col  Val
0     4    3    5
1     0    0    2
2     0    2    1
3     1    1    4
4     2    0    3
5     3    2    5

Interchange Row and Column:

Sparse Matrix 2 (Interchanged)
     Row  Col  Val
0     3    4    5
1     0    0    2
2     2    0    1
3     1    1    4
4     0    2    3
5     2    3    5

As we can see, just exchanging the row and column of each entry
does not give the transpose in its proper form, because the entries
are no longer sorted by row.

Transpose Matrix (3 * 4 Matrix)
     0    1    2    3
0    2    0    3    0
1    0    4    0    0
2    1    0    0    5

Hence we sort the interchanged sparse matrix row-wise.

Simple Transpose:

Sparse Matrix 2 (Transposed)
     Row  Col  Val
0     3    4    5
1     0    0    2
2     0    2    3
3     1    1    4
4     2    0    1
5     2    3    5

This sorted sparse matrix represents the exact transpose.
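#include <iostream>
using namespace std;

// Maximum number of non-zero values the tables can hold
const int MAX_TERMS = 100;

int main()
{
    // Row 0 of each table holds the header; entries use rows 1..t
    // and columns 1..3 (Row, Column, Element).
    // Assumes the input has at most MAX_TERMS non-zero values.
    int sparse[MAX_TERMS + 1][4], transpose[MAX_TERMS + 1][4];
    int m, n, q, t, element;
    int i, j;

    cout << "Enter Number of rows and columns";
    cin >> m >> n;
    t = 0;

    // Reading the matrix and storing only the non-zero values
    cout << "\nEnter the matrix:\n";
    for (i = 1; i <= m; i++)
    {
        for (j = 1; j <= n; j++)
        {
            cin >> element;
            if (element != 0)
            {
                t = t + 1;
                sparse[t][1] = i;
                sparse[t][2] = j;
                sparse[t][3] = element;
            }
        }
    }

    cout << "\n\nThe sparse matrix is :\n\nRow\tColumn\tElement";
    // Displaying the header followed by the non-zero values
    cout << "\n\n" << m << "\t" << n << "\t" << t << "\n\n";
    for (i = 1; i <= t; i++)
    {
        cout << sparse[i][1] << "\t" << sparse[i][2] << "\t" << sparse[i][3] << "\n";
    }

    // Header of the transpose: rows and columns are exchanged
    sparse[0][1] = n; sparse[0][2] = m; sparse[0][3] = t;
    q = 1;

    // Transpose of the matrix: for each column i of the original,
    // collect its entries in order, so the result is sorted by row
    if (t > 0)
    {
        for (i = 1; i <= n; i++)
        {
            for (j = 1; j <= t; j++)
            {
                if (sparse[j][2] == i)
                {
                    transpose[q][1] = sparse[j][2];
                    transpose[q][2] = sparse[j][1];
                    transpose[q][3] = sparse[j][3];
                    q = q + 1;
                }
            }
        }
    }

    cout << "\n\nThe transpose of the sparse matrix :\n ";
    cout << "\nRow\tColumn\tElement\n\n";
    cout << sparse[0][1] << "\t" << sparse[0][2] << "\t" << sparse[0][3] << "\n\n";
    for (i = 1; i <= t; i++)
    {
        cout << transpose[i][1] << "\t" << transpose[i][2] << "\t" << transpose[i][3] << "\n";
    }
    return 0;
} // End of main function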
Fast Transpose:

Matrix 3 (4 * 3 Matrix)
     0    1    2
0    0    3    1
1    0    0    6
2    7    0    0
3    0    2    0

Sparse Matrix 3
     Row  Col  Val
0     4    3    5
1     0    1    3
2     0    2    1
3     1    2    6
4     2    0    7
5     3    1    2
Total[i] = Number of non-zero values in column i
Size of Index[] = Size of Total[] + 1
Index[i] = Index[i-1] + Total[i-1]

Total number of non-zero values = 5
Total[0] = 1   (number of non-zero values in column 0)
Total[1] = 2   (number of non-zero values in column 1)
Total[2] = 2   (number of non-zero values in column 2)

Index[i] = Index[i-1] + Total[i-1]
                                                  Value   Locations
Index[0]   (always set to 1)                        1     1
Index[1] = Index[0] + Total[0] = 1 + 1              2     2, 3
Index[2] = Index[1] + Total[1] = 2 + 2              4     4, 5
Index[3] = Index[2] + Total[2] = 4 + 2              6
For Fast Transpose:
1. Interchange Row and Column of each entry of the sparse matrix.
2. Find the location of the entry by checking the Index value of its
interchanged row.
For example:
Row  Col  Interchanged Row  Interchanged Col  Value
0    1    1                 0                 3
Now check the value in Interchanged Row. Since it is 1, check the
value in Index[1] = 2, so add the entry in location 2 of the
transposed sparse matrix. Then increase Index[1] by 1, so that the
next entry for row 1 goes in location 3.

Transposed sparse matrix after placing the first entry:
     Row  Col  Val
0     3    4    5
1
2     1    0    3
3
4
5
3. Table for fast transpose
Row  Col  Interchanged  Interchanged  Value  Index of      Location in
          Row           Col                  [row]         Sparse Matrix
0    1    1             0             3      Index[1]      2
0    2    2             0             1      Index[2]      4
1    2    2             1             6      Index[2] + 1  5
2    0    0             2             7      Index[0]      1
3    1    1             3             2      Index[1] + 1  3
Final Fast Transpose:

Sparse Matrix 3 (Transposed)
     Row  Col  Val
0     3    4    5
1     0    2    7
2     1    0    3
3     1    3    2
4     2    0    1
5     2    1    6
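The Total and Index steps above can be sketched in C++ as follows. The names Triple, SparseMatrix and fastTranspose are assumptions for illustration. Unlike the tables, this sketch keeps the header in separate rows and cols fields, so the first location is 0 rather than 1; the input entries are assumed to be sorted by row.

```cpp
#include <cstddef>
#include <vector>

// One non-zero element (hypothetical type for this sketch).
struct Triple { int row; int col; int val; };

// Dimensions plus the non-zero entries, sorted by row and then by column.
struct SparseMatrix
{
    int rows;
    int cols;
    std::vector<Triple> entries;
};

SparseMatrix fastTranspose(const SparseMatrix& m)
{
    // total[c] = number of non-zero values in column c of m.
    std::vector<std::size_t> total(m.cols);
    for (const Triple& t : m.entries)
        ++total[t.col];

    // index[c] = first location in the result for entries from column c.
    // The tables start at 1 because location 0 holds the header; here
    // locations start at 0.
    std::vector<std::size_t> index(m.cols + 1);
    for (int c = 1; c <= m.cols; ++c)
        index[c] = index[c - 1] + total[c - 1];

    // Place each entry directly at its final location, then advance the
    // location for that column so the next entry lands right after it.
    SparseMatrix result{m.cols, m.rows, std::vector<Triple>(m.entries.size())};
    for (const Triple& t : m.entries)
    {
        result.entries[index[t.col]] = {t.col, t.row, t.val};
        ++index[t.col];
    }
    return result;
}
```

This version passes over the entries twice and over the columns once, so it runs in time proportional to (number of columns + number of non-zero values), whereas the simple transpose above scans all entries once per column.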