The algorithm presented here performs in O(n

'Square' matrix multiplication is used here only for explanation. Actually the algorithm can be used for any matrix multiplication of arbitrary input sizes. Precisely, if the two input matrices have sizes n-by-v and v-by-m respectively, then the total number of operations will be at most (3nm + 4v(n+m) + 2m-n+3). In general then, the algorithm does matrix multiplication in linear time with total matrix size. However there is often computation on very large numbers. Precisely, when the largest entry is equal to v, the storage space complexity is O(nm×log(v)).

In this article, the illustration and analysis of correctness and performance of the algorithm are done simultaneously. Note carefully that in the analysis of performance, the addition, subtraction, multiplication, division and modulo(remainder) operations between ANY two numbers are assumed to each take a constant time. Though, given a space complexity of O(f(n)) of the operands to an operation, it will actually take at least O(f(n)) time complexity. But the assumption is made so to emphasise on the basic idea of the matrix multiplication algorithm.

The special case of n=3 will be used all through.

Download the available implementation of the algorithm in C language or in Python3 language.

- The naive algorithm
- The basic idea
- Obtaining the RHS values in quadratic time
- Base n numeral system
- When inputs are natural numbers or zero
- Constraints over a better algorithm
- Links

d e f m n o D = g h i M = p q r j k l s t uLet

Note: differentiate between the entry n in M and the size n of the matrices.

By using the naive algorithm for matrix multiplication, the following is obtained:

d e f m n o (dm+ep+fs) (dn+eq+ft) (do+er+fu) D × M = g h i × p q r = (gm+hp+is) (gn+hq+it) (go+hr+iu) = R j k l s t u (jm+kp+ls) (jn+kq+lt) (jo+kr+lu)

Every element of the matrix R is obtained by doing 2n operations. There are n

By summing the rows of R, we have:

(dm+ep+fs) + (dn+eq+ft) + (do+er+fu) = d(m+n+o) + e(p+q+r) + f(s+t+u) (gm+hp+is) + (gn+hq+it) + (go+hr+iu) = g(m+n+o) + h(p+q+r) + i(s+t+u) (jm+kp+ls) + (jn+kq+lt) + (jo+kr+lu) = j(m+n+o) + k(p+q+r) + l(s+t+u)Also, by summing the columns of R, we have:

(dm+ep+fs) (dn+eq+ft) (do+er+fu) + + + (gm+hp+is) (gn+hq+it) (go+hr+iu) + + + (jm+kp+ls) (jn+kq+lt) (jo+kr+lu) = = = m(d+g+j) n(d+g+j) o(d+g+j) + + + p(e+h+k) q(e+h+k) r(e+h+k) + + + s(f+i+l) t(f+i+l) u(f+i+l)From here on only the case of the rows of R will be considered. However what is to be done is also possible for the case of the columns of R.

Now, consider certain

These coefficients are such that when used in the matrix R we have:

(dm+ep+fs)a + (dn+eq+ft)b + (do+er+fu)c = d(ma+nb+oc) + e(pa+qb+rc) + f(sa+tb+uc) (gm+hp+is)a + (gn+hq+it)b + (go+hr+iu)c = g(ma+nb+oc) + h(pa+qb+rc) + i(sa+tb+uc) (jm+kp+ls)a + (jn+kq+lt)b + (jo+kr+lu)c = j(ma+nb+oc) + k(pa+qb+rc) + l(sa+tb+uc)As will be seen later, it is possible to get all the Right-Hand-Side (RHS) values in O(n

You are in a situation in which you choose three values a, b and c, such that when used with three unknowns x, y and z, you obtain a value k = ax+by+cz. The task is to use your three chosen values a, b and c, together with a

The matrix M is of the form:

m n o M = p q r s t uThe array containing the coefficients {a, b, c} is stored in memory.

Now, consider a vector or array S, of the form:

S = { (ma+nb+oc) , (pa+qb+rc) , (sa+tb+uc) }Every element of the array S is obtained by doing 2n operations. There are n elements. This forms a total of 2n * n = 2n

The matrix D is of the form:

d e f D = g h i j k lThe array S is stored in memory.

Now, consider a vector or array T, of the form:

T = { d(ma+nb+oc) + e(pa+qb+rc) + f(sa+tb+uc) , g(ma+nb+oc) + h(pa+qb+rc) + i(sa+tb+uc) , j(ma+nb+oc) + k(pa+qb+rc) + l(sa+tb+uc) }Every element of the array T is obtained by doing 2n operations. There are n elements. This forms a total of 2n * n = 2n

The array T contains the RHS values. Every element of T will be used so to obtain the n unknowns found in the corresponding row of the result matrix R. The overall number of operations used to obtain T is 2n

In the base n numeral system, any non-negative integer can be correctly and distinctly represented as a sequence of whole numbers, also called digits. For example the number 'five' is represented in base 10 as '5', in base 3 as '12', and in base 2 as '101'. All digits are whole numbers between 0 and n-1 inclusive. As the word 'digit' is usually known to refer to a number between 0 and 9, it will not be used anymore so to avoid confusion.

Let

Given the base n, the sequence s and the size m, the equation below finds I:

I = s

Proof of that there is only one possible representation of a non-negative integer I as a sequence s, under the equation above.

First it should be noted that the largest computation of I with respect to the equation is for the sequence {n-1, n-1, n-1, ... m times}. In this case I evaluates to n

Now let any two sequences {s

I

It turns out that for the above equation to hold, every term must evaluate to zero. Consider the worst case scenario where the term (s

Only the values in brackets can evaluate to zero. Consequently for the equation to hold: s

Given the base n, the number I, and a particular level in s, the equation below finds s

s

where [x] = floor(x) = largest integer smaller than or equal to x.

Consider true that I = s

This represents computing the index I from the sequence s.

Now let y such that (substituting for s

y = ([ I / n

Simplifying the expression,

=> y = [ I / n

After massive cancellation, only the first and last terms remain,

=> y = [ I / n

But I < n

=> y = I - 0

=> y = I

Since y=I, it follows that s

Now the problem initially mentioned can be solved.

Essentially, we have: k = ax+by+cz, where all are non-negative integers.

By analogy, we have: I = x*n

Therefore we have: s={x, y, z}, k=I, a=n

The base 'n' is any integer greater than x, y and z.

(dm+ep+fs)a + (dn+eq+ft)b + (do+er+fu)c = d(ma+nb+oc) + e(pa+qb+rc) + f(sa+tb+uc) (gm+hp+is)a + (gn+hq+it)b + (go+hr+iu)c = g(ma+nb+oc) + h(pa+qb+rc) + i(sa+tb+uc) (jm+kp+ls)a + (jn+kq+lt)b + (jo+kr+lu)c = j(ma+nb+oc) + k(pa+qb+rc) + l(sa+tb+uc)It is assumed that all inputs are natural numbers or zero. So the 'base n numeral system' can be used.

Following from the previous section, the choices of the coefficients a, b and c are:

a = L

b = L

c = L

where L is a number larger than any element of the result matrix R.

Let

It is clear that L, where L = (g

The coefficients a, b and c can now be obtained. Since they are stored in memory they can be computed one after another. That is a is used to compute b, then b is used to compute c. This forms an additional 2n-1 operations.

The array T can now be obtained, taking 4n

However is it really necessary for the elements of T to be essentially 'numbers'...?!

Following from the previous section, an element in T can be regarded as the value I. In this case the corresponding row of the result matrix R will be the sequence s that represents I. If for example:

T[0] = d(ma+nb+oc) + e(pa+qb+rc) + f(sa+tb+uc) = (dm+ep+fs)a + (dn+eq+ft)b + (do+er+fu)c then: (dm+ep+fs) = [ ( T[0] mod b) / a ] (dn+eq+ft) = [ ( T[0] mod c) / b ] (do+er+fu) = [ ( T[0] ) / c ]This forms 3n-1 operations (not 2n-1, in which case the analyses of performances made so far will have a problem!). For all n rows of R, there is a total of (3n-1)*n = 3n

Combining all the operations that have been analised, the overall total number of operations undertaken by the algorithm = 4n

As far as matrix multiplication is concerned, there is to wonder whether the elements of the result matrix R can be any values. If so then, based on the basic idea, there shouldn't be any better algorithm than the one discussed here. And also there shouldn't be any algorithm for the case where real numbers are involved. However, there is to wonder whether the elements of R are not just any values, but values with some relationship with the values of the input matrices D and M, the later which are of course any values. If so then there may be a better algorithm. So at the very least, based on the basic idea, a better algorithm should make use of a characteristic true to matrix multiplication.

Actually there is a characteristic of matrix multiplication. So far only the case of the rows of R was considered. But the case of the columns of R could have been used as well. It turns out that both cases share elements of the result matrix R. That is for example:

T[0] for rows = d(ma+nb+oc) + e(pa+qb+rc) + f(sa+tb+uc) = (dm+ep+fs)a + (dn+eq+ft)b + (do+er+fu)c T[0] for columns = m(da+gb+jc) + p(ea+hb+kc) + s(fa+ib+lc) = (dm+ep+fs)a + (gm+hp+is)b + (jm+kp+ls)cT[0] for both rows and columns share one unknown element of R, which is (dm+ep+fs). This characteristic of matrix multiplication may help to design a better algorithm.