**To appear (accepted):** Paolo D'Alberto,
Marco Bodrato, and Alexandru Nicolau *Exploiting
Parallelism in Matrix-Computation Kernels for Symmetric Multiprocessor
Systems: Matrix-Multiplication and Matrix-Addition Algorithm
Optimizations by Software Pipeline and Threads Allocation* in
TOMS,
ACM
press.

**Download:** ~~DVI-paper~~, PDF-paper, PDF-slides, BibTex-entry, software.

**Abstract:** Although Strassen's method is not the
asymptotically fastest matrix multiplication known, it is the most
widely used for large matrices over finite fields. After his first
paper, some variants have been proposed, with different additive
complexity; here we describe a new one.

The new variant is as
good as those already known for a simple matrix multiplication, but
can save operations either when more than two matrices are to be
multiplied or for squaring. Moreover it can be proved optimal for these
tasks.

The biggest gain is shown for n^{th}-power computation:
in this scenario the additive complexity can be halved with respect
to Strassen's original algorithm.
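
For reference, the classical Strassen scheme that the abstract compares against can be sketched as follows. This is the original 7-multiplication, 18-addition formulation, not the new variant described in the paper; the helper names (`add`, `sub`, `split`, `strassen`) are illustrative, and the sketch assumes square matrices whose size is a power of two, stored as lists of lists.

```python
def add(X, Y):
    """Entry-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]


def sub(X, Y):
    """Entry-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]


def split(M):
    """Split a 2h x 2h matrix into its four h x h quadrants."""
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])


def strassen(A, B):
    """Classical Strassen product; size must be a power of two."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Seven recursive products instead of eight.
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    # Recombination: together with the sums above, 18 block additions.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [r + s for r, s in zip(C11, C12)]
    bottom = [r + s for r, s in zip(C21, C22)]
    return top + bottom
```

Squaring is simply `strassen(A, A)` here; the savings for squaring and powers claimed in the abstract come from the paper's variant and are not reproduced by this sketch.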

**Published:** Preprint N.622 Centro
Interdipartimentale "Vito Volterra" - Università di Roma "Tor Vergata"
(2008)

**Published (full revised version):**
*A
Strassen-like matrix multiplication suited for squaring and higher
power computation* in Proceedings of
the ISSAC
2010 conference, München, Germany, 25-28 July,
2010, ACM press.

**Download:** DVI-paper (29 kB), PDF-paper (265 kB), DjVu-paper (139 kB), PDF-slides (224 kB), TeX-slides (6 kB), DVI-preprint (25 kB), PDF-preprint (247 kB), DjVu-preprint (71 kB), BibTex-entry, software.

**Abstract:** Some hints and tricks to automatically
obtain high degree Toom-Cook implementations, i.e. functions for
integer or polynomial multiplication with a reduced complexity. The
described method generates quite an efficient sequence of operations
and the memory footprint is kept low by using a new strategy: mixing
evaluation, interpolation and recomposition phases. It is possible to
automatise the whole procedure obtaining a general Toom-*n*
function, and to extend the method to polynomials in any
characteristic except two.
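
The three phases named above (evaluation, pointwise multiplication, interpolation) can be illustrated with a deliberately naive generic sketch. It keeps the phases separate and solves the interpolation system by exact Gaussian elimination, so it shows only the structure of a Toom-style product, not the optimised or mixed-phase sequences the paper generates. The function name `toom_poly_mul` and the choice of points 0, 1, -1, 2, -2, ... are assumptions made for illustration.

```python
from fractions import Fraction


def toom_poly_mul(a, b):
    """Multiply integer coefficient lists (lowest degree first) by
    evaluation, pointwise products and exact interpolation."""
    m = len(a) + len(b) - 1                      # coefficients in the product
    points = ([0] + [s * j for j in range(1, m) for s in (1, -1)])[:m]

    def evaluate(p, x):
        acc = 0
        for c in reversed(p):
            acc = acc * x + c
        return acc

    # Evaluation and pointwise multiplication.
    values = [evaluate(a, x) * evaluate(b, x) for x in points]

    # Interpolation: solve the Vandermonde system over the rationals.
    rows = [[Fraction(x) ** j for j in range(m)] + [Fraction(v)]
            for x, v in zip(points, values)]
    for col in range(m):
        piv = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        pivot = rows[col][col]
        rows[col] = [e / pivot for e in rows[col]]
        for r in range(m):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [e - f * p for e, p in zip(rows[r], rows[col])]
    return [int(row[m]) for row in rows]
```

For example, `toom_poly_mul([1, 2, 3], [4, 5])` multiplies 1 + 2x + 3x² by 4 + 5x. The divisions hidden in the elimination step are the reason a characteristic-dependent treatment is needed, as the abstract notes for characteristic two.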

**Published:** in E. Antelo, D. Hough and P. Ienne, editors, Proceedings of the 20th IEEE Symposium on Computer Arithmetic, pages 15-22, IEEE, Tübingen, Germany, July 25-27, 2011.

**Download:** ~~PDF-paper (363 kB), PDF-slides (467 K), TeX-slides (- K)~~, software.

**Abstract:**
Toom-Cook strategy is a well-known method for building algorithms to
efficiently multiply dense univariate polynomials. Efficiency of the
algorithm depends on the choice of interpolation points and on the
exact sequence of operations for evaluation and interpolation. If
carefully tuned, it gives the fastest algorithm for a wide range of
inputs. This work smoothly extends the Toom strategy to polynomial
rings, with a focus on
**GF(2)[x]**. Moreover a method is
proposed to find the fastest Toom multiplication algorithm for any
given splitting order. New results found with it, for polynomials in
characteristic 2, are presented.

A new extension for multivariate polynomials is also introduced, through a new definition of density that makes the Toom strategy efficient.
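
As a minimal illustration of the setting, the simplest Toom instance (Toom-2, i.e. Karatsuba) already works in **GF(2)[x]**, because it needs only the evaluation points 0, 1 and ∞. Higher splitting orders need more points than GF(2) provides, which is the difficulty the paper addresses and which this sketch does not cover. Polynomials are stored as Python integers used as bit masks (bit *i* is the coefficient of x^i); the names `clmul` and `karatsuba_gf2` and the `threshold` parameter are illustrative assumptions.

```python
def clmul(a, b):
    """Schoolbook carry-less product in GF(2)[x] (ints as bit masks)."""
    r = 0
    while b:
        if b & 1:
            r ^= a          # addition in GF(2)[x] is XOR
        a <<= 1
        b >>= 1
    return r


def karatsuba_gf2(a, b, threshold=8):
    """Karatsuba (Toom-2) product in GF(2)[x] with points 0, 1, infinity."""
    n = max(a.bit_length(), b.bit_length())
    if n <= threshold:
        return clmul(a, b)
    k = n // 2
    mask = (1 << k) - 1
    a0, a1 = a & mask, a >> k
    b0, b1 = b & mask, b >> k
    w0 = karatsuba_gf2(a0, b0, threshold)              # value at 0
    winf = karatsuba_gf2(a1, b1, threshold)            # value at infinity
    w1 = karatsuba_gf2(a0 ^ a1, b0 ^ b1, threshold)    # value at 1
    middle = w1 ^ w0 ^ winf                            # subtraction is XOR too
    return w0 ^ (middle << k) ^ (winf << (2 * k))
```

Three half-size products replace four, and no division is needed, which is why this case is unproblematic in characteristic 2.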

**Published:** In C.Carlet and B.Sunar, editors,
*WAIFI'07
proceedings*, volume 4547 of *LNCS*, pages 116-133. Springer,
Madrid, España, June 21-22, 2007.

**Also known as:** *Searching Optimal Toom-Cook Algorithms for Polynomials in Characteristic 2 and 0* (working title).

**Download:** DVI-paper (38 kB), PDF-paper (320 kB), DjVu-paper (132 kB), PDF-slides (380K), TeX-slides (7K), BibTex-entry, software.

**Abstract:** The use of Toom-Cook sub-quadratic
polynomial multiplication was recently shown to be possible also when
the coefficient field does not have enough elements, particularly for
**𝔽_{2}[x]**. This paper focuses on how Toom's
strategies can be adapted to polynomials over non-binary small
fields. In particular we describe the Toom-3 algorithm for

Algorithms are described in full detail: not only the asymptotic complexity is given, but also the exact sequence of operations for possibly optimal implementations.

Algorithms given here for

Moreover some connections with FFT-based multiplication are found, slightly improving the worst cases of FFT.

**Published:** Preprint Centro
Interdipartimentale "Vito Volterra" - Università di Roma "Tor Vergata"
(2007)

**Download:** DVI-paper (32 kB), PDF-paper (307 kB), DjVu-paper (104 kB), ~~BibTex-entry, software~~.

**Abstract:**
In this paper the problem of univariate polynomial evaluation
is considered. When both polynomial coefficients and
the evaluation point are integers, unbalanced multiplications (one
factor having many more digits than the other one) in the classical
Ruffini-Horner rule do not let computations fully benefit from
subquadratic methods, like Karatsuba, Toom-Cook and
Schönhage-Strassen's.

We face this problem by applying an approach originally proposed by
Estrin to augment parallelism exploitation in computation. We show
that it is also effective in the sequential case, whenever data dimensions
grow, e.g. in the long integer case.
We add some adjustments to Estrin's proposal, to obtain a smoother
behaviour around corner cases and to avoid performance degradation when
most of the coefficients are zero.

This way, a new general algorithm is obtained, improving both
theoretical complexity and actual performance. The algorithm itself is
very simple, and it can be usefully extended to evaluation of
polynomials on rationals or on polynomials (polynomial composition).

Some tests, results and comparisons obtained with PARI/GP are also presented,
for both dense and "sparse" polynomials.
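
The contrast between the two evaluation rules can be sketched as follows. This is the plain textbook form of both schemes, without the adjustments for corner cases and sparse inputs that the paper adds; the function names `horner` and `estrin` are illustrative.

```python
def horner(coeffs, x):
    """Ruffini-Horner rule; coeffs are given lowest degree first.
    The accumulator grows at every step while x stays small, so the
    multiplications become more and more unbalanced."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


def estrin(coeffs, x):
    """Estrin's scheme: combine neighbouring coefficients pairwise,
    then square the evaluation point and repeat."""
    level = list(coeffs)
    if not level:
        return 0
    power = x                      # x^(2^k) at round k
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)        # pad to an even number of terms
        level = [level[i] + level[i + 1] * power
                 for i in range(0, len(level), 2)]
        power = power * power
    return level[0]
```

In `estrin` the two factors of each product (`level[i + 1]` and `power`) grow together from round to round, so with long integers the multiplications stay roughly balanced and subquadratic methods can be exploited.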

**Published:** *Long Integers and Polynomial
Evaluation with Estrin's Scheme* in Proceedings of the
13^{th} SYNASC Symposium, Timişoara,
Romania, September 26-29, 2011

**Download:** ~~PDF-paper (309 kB), DjVu-paper (- kB)~~, PDF-slides (610 kB), ~~BibTex-entry, software~~.

**Abstract:**
Karatsuba and Toom-Cook are well-known methods used to efficiently
multiply two long integers. There have been different proposals
about the interpolating values used to determine the matrix to be
inverted and the sequence of operations to invert it. A definitive
word about which is the optimal matrix (values) and the (number of)
basic operations to invert it seems still not to have been said. In
this paper we present some particular examples of useful matrices
and a method to generate automatically, by means of optimised
exhaustive searches on a graph, the best sequence of basic
operations to invert them.
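
To make "sequence of basic operations to invert the matrix" concrete, here is one level of Toom-3 for non-negative integers using the points 0, 1, -1, -2 and ∞, with a short interpolation sequence that is widely reproduced for this choice of points. Whether it coincides step by step with the sequence reported in the paper is not asserted here; the function name `toom3` and the limb-size parameter `k` are illustrative, and the five pointwise products simply use Python's built-in multiplication.

```python
def toom3(a, b, k):
    """One level of Toom-3 for non-negative integers, split at 2**k."""
    mask = (1 << k) - 1
    a0, a1, a2 = a & mask, (a >> k) & mask, a >> (2 * k)
    b0, b1, b2 = b & mask, (b >> k) & mask, b >> (2 * k)

    def evaluate(c0, c1, c2):
        """Values of c0 + c1*x + c2*x^2 at 0, 1, -1, -2 and infinity."""
        t = c0 + c2
        return (c0, t + c1, t - c1, c0 - 2 * c1 + 4 * c2, c2)

    pa, pb = evaluate(a0, a1, a2), evaluate(b0, b1, b2)
    w0, w1, wm1, wm2, winf = (x * y for x, y in zip(pa, pb))

    # Interpolation: every division below is exact.
    r0, r4 = w0, winf
    r3 = (wm2 - w1) // 3
    r1 = (w1 - wm1) // 2
    r2 = wm1 - w0
    r3 = (r2 - r3) // 2 + 2 * winf
    r2 = r2 + r1 - r4
    r1 = r1 - r3

    # Recomposition at x = 2**k.
    return (r0 + (r1 << k) + (r2 << (2 * k))
            + (r3 << (3 * k)) + (r4 << (4 * k)))
```

The interpolation uses only additions, subtractions, shifts, two exact divisions by 2 and one exact division by 3; counting and minimising such steps is the kind of optimisation the abstract refers to.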

**Published:** Preprint N.605 Centro
Interdipartimentale "Vito Volterra" - Università di Roma "Tor Vergata"
(2006)

**Published (full revised version):**
*Integer
and Polynomial Multiplication: Towards Optimal Toom-Cook
Matrices* in Proceedings of
the ISSAC
2007 conference, Ontario, Canada, July 29-August 1,
2007, ACM press.

**Download:** PDF-preprint (263 kB), DjVu-preprint (104 kB), PDF-paper (287 kB), PDF-slides (475kB), BibTex-entry, software.

**Abstract:**
In Gröbner bases computation, as in other algorithms in commutative algebra, a
general open question is how to guide the calculations coping with
numerical coefficients and/or not exact input data. It often happens
that, due to error accumulation and/or insufficient working
precision, the obtained result is not what one expects from a theoretical
derivation. The resulting basis may have more or fewer polynomials, a
different number of solutions, roots with different multiplicity,
another Hilbert function, and so on. Augmenting precision we may
overcome algorithmic errors, but one does not know in advance how
much this precision should be, and a trial-and-error approach is
often the only way to follow. Coping with initial errors is an even
more difficult task. In this experimental work we propose the
combined use of syzygies and interval arithmetic to decide what to
do at each critical point of the algorithm.
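
The role interval arithmetic can play at a "critical point" is easy to illustrate: when a coefficient is only known up to an error, an interval enclosure tells whether it is certainly non-zero or whether zero cannot be excluded. The sketch below is a generic illustration with exact rational endpoints (so no rounding control is needed); the `Interval` class is hypothetical and the combination with syzygies proposed in the paper is not shown.

```python
from fractions import Fraction


class Interval:
    """Closed interval [lo, hi] with exact rational endpoints."""

    def __init__(self, lo, hi=None):
        self.lo = Fraction(lo)
        self.hi = Fraction(lo if hi is None else hi)

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def contains_zero(self):
        """True when the data cannot exclude that the value is zero."""
        return self.lo <= 0 <= self.hi


# A coefficient known only as 0.333..0.334 (an inexact 1/3).
a = Interval("0.333", "0.334")
maybe_zero = (a * Interval(3) - Interval(1)).contains_zero()
surely_nonzero = not (a * Interval(3) - Interval(2)).contains_zero()
```

Here `a * 3 - 1` encloses zero, so a leading-term cancellation cannot be decided from the data alone, whereas `a * 3 - 2` is certainly non-zero.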

**Published:** Proceedings workshop CASC
2006. V.G. Ganzha, E.W. Mayr, E.V. Vorozhtsov ed. Springer-Verlag
LNCS
4194. Chişinau, Moldova 2006, pp. 64-76, ISBN: 3-540-45182-X

**Download:** PDF-paper (250 kB), DjVu-paper (87 kB), PDF-slides (538K), BibTex-entry,
~~software~~.

**Abstract:**
In Gröbner bases computation a general open question is how
to guide calculations coping with numerical coefficients and/or not exact
input data. It may happen that, due to error accumulation or insufficient
working precision, the result is not what one theoretically expects. The basis
may have more or fewer polynomials, a different number of solutions, a
zero set with wrong multiplicity, and so on. Augmenting precision we
may overcome algorithmic errors, but we don't know in advance how
much it should be, and a trial-and-error approach is often the only way.
Coping with initial errors is an even more difficult task. In this work
the combined use of syzygies and interval arithmetic is proposed as a
technique to decide at each critical point of the algorithm what to do.

**Published: (reduced version)** Proceedings of the
6^{th} SYNASC Symposium. D. Petcu,
D. Zaharie, V. Negru, T. Jebelean ed. MIRTON, Timişoara, Romania 2004,
pp. 77 - 89, ISBN 973-661-441-7

**Published: (full version)** Analele
Universitatii din Timisoara, Vol. XLII, Fasc.special 2,
T. Jebelean, V. Negru, A. Popovici ed., Timişoara, Romania 2004,
pp. 13 - 30, ISSN 1224-970X

**Download:** PDF-paper, BibTex-entry, ~~software~~.

**Abstract:** We improve our proposal of a new variant
of the McEliece cryptosystem based on QC-LDPC codes. The original
McEliece cryptosystem, based on Goppa codes, is still unbroken up to
now, but has two major drawbacks: long keys and low transmission rate.
Our variant is based on QC-LDPC codes and is able to overcome such
drawbacks, while avoiding the known attacks. Recently, however, a new
attack has been discovered that can recover the private key with
limited complexity. We show that such an attack can be avoided by
changing the form of some constituent matrices, without altering the
remaining system parameters. We also propose another variant that
exhibits an overall increased security level. We analyse the
complexity of the encryption and decryption stages by adopting
efficient algorithms for processing large circulant matrices. We
consider the Toom-Cook algorithm and the short Winograd convolution,
which give a significant speed-up in the cryptosystem operations.
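
The link between circulant matrices and fast polynomial multiplication can be sketched as follows: multiplying a row vector by a binary circulant matrix is the same as multiplying two polynomials over GF(2) modulo x^n - 1. The function names are illustrative, the cyclic product shown is the plain schoolbook one, and a Toom-Cook or Winograd routine would replace it in an efficient implementation.

```python
def circulant(first_row):
    """n x n circulant matrix whose row i is first_row rotated right by i."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


def vec_times_matrix_gf2(v, M):
    """Row vector times matrix over GF(2), the direct O(n^2) way."""
    n = len(v)
    return [sum(v[i] & M[i][j] for i in range(n)) % 2 for j in range(n)]


def cyclic_product_gf2(v, c):
    """v(x) * c(x) modulo x^n - 1 over GF(2), coefficients lowest first."""
    n = len(v)
    out = [0] * n
    for i, vi in enumerate(v):
        if vi:
            for k, ck in enumerate(c):
                out[(i + k) % n] ^= ck
    return out


v = [1, 0, 1, 1, 0]
c = [1, 1, 0, 0, 0]
same = vec_times_matrix_gf2(v, circulant(c)) == cyclic_product_gf2(v, c)
```

Because only the first row of each circulant block has to be stored and the product reduces to a polynomial multiplication, any subquadratic multiplication algorithm speeds up the matrix operations directly.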

**Published:**
in *Proceedings of the Sixth Conference SCN*, volume
5229 of *LNCS*, pages 246-262. Springer, Amalfi, Italy,
September 10-12, 2008.

**Download:** DVI-paper (34 kB), PDF-paper (251 kB), DjVu-paper (111 kB), PDF-slides (523K), BibTex-entry, ~~software~~.

**Ask for a copy:** while publisher restrictions are in
force, a copy of the paper can be requested from one of the
authors: Marco Bodrato.

Marco Bodrato - 4 November 2011