Category Archives: Math

Math: $100 million and 5 million theorems

Source: Wolfram website, Aug 2014

How big is the historical corpus of mathematics? There’ve probably been about 3 million mathematical papers published altogether—or about 100 million pages, growing at a rate of about 2 million pages per year. And in all of these papers, perhaps 5 million distinct theorems have been formally stated.

Then what would it take to curate the complete literature of mathematics? In the eCF project, it took about 3 hours of mathematician time to encode each theorem in computable form. But all this work was done by hand, and in a larger-scale project, I am certain that an increasing fraction of it could be done automatically, not least using extensions of our Wolfram|Alpha natural language understanding system.

There are issues like whether theorems stated in papers are actually valid, and even whether theorems that were considered valid, say, 100 years ago are still considered valid today. For example, for continued fractions, there are lots of pre-1950 theorems that were successfully proved in their time, but which ignore branch cuts, and so wouldn’t be considered correct today.

In the end, of course, it requires lots of actual, skilled mathematicians to guide the curation process, and to encode theorems. But in a sense this kind of mobilization of mathematicians is not completely unfamiliar; it’s something like what was needed when Zentralblatt was started in 1931, or Mathematical Reviews in 1940. (As a curious footnote, the founding editor of both these publications was Otto Neugebauer, who worked just down the hall from me at the Institute for Advanced Study in the early 1980s, but who I had no idea was involved in anything other than decoding Babylonian mathematics until I was doing research for this blog post.)

The whole project is necessarily quite large—perhaps the world’s first example of “big math”. So can the project get done in the world today? A crucial part is that we now have the technical capability to design the language and build the infrastructure that’s needed. But beyond that, the project also needs a strong commitment from the world’s mathematics community—as well as lots of contributions from individual mathematicians from every possible field. And realistically it’s not a project that can be justified on commercial grounds—so the likely $100+ million that it will need will have to come from non-commercial sources.

It’s a great and important project—one that promises to be pivotal for pure mathematics. In almost every field there are golden ages when dramatic progress is made. And more often than not, such golden ages are initiated by new methodology and the arrival of new technology. And this is exactly what I think will happen in pure mathematics. If we can mobilize the effort to curate known mathematics and build the system to use and generate computational knowledge around it, then we will not only succeed in preserving and spreading the great heritage of pure mathematics, but we will also thrust pure mathematics into a period of dramatic growth.


Perfect Proofs

Source: Quanta, Mar 2018

These are the components of a beautiful proof:

  • It can’t be too long;
  • it has to be clear;
  • there has to be a special idea;
  • it might connect things that usually one wouldn’t think of as having any connection.

The book, which has been called “a glimpse of mathematical heaven,” presents proofs of dozens of theorems from number theory, geometry, analysis, combinatorics and graph theory.

Mathematicians have come up with at least 196 different proofs of the “quadratic reciprocity” theorem (concerning which numbers in “clock” arithmetic are perfect squares) and nearly 100 proofs of the fundamental theorem of algebra (concerning solutions to polynomial equations). Why do you think mathematicians keep devising new proofs for certain theorems, when they already know the theorems are true?

These are things that are central in mathematics, so it’s important to understand them from many different angles. There are theorems that have several genuinely different proofs, and each proof tells you something different about the theorem and the structures. So, it’s really valuable to explore these proofs to understand how you can go beyond the original statement of the theorem.

An example comes to mind — which is not in our book but is very fundamental — Steinitz’s theorem for polyhedra. This says that if you have a planar graph (a network of vertices and edges in the plane) that stays connected if you remove one or two vertices, then there is a convex polyhedron that has exactly the same connectivity pattern. This is a theorem that has three entirely different types of proof — the “Steinitz-type” proof, the “rubber band” proof and the “circle packing” proof. And each of these three has variations.

Any of the Steinitz-type proofs will tell you not only that there is a polyhedron but also that there’s a polyhedron with integers for the coordinates of the vertices. And the circle packing proof tells you that there’s a polyhedron that has all its edges tangent to a sphere. You don’t get that from the Steinitz-type proof, or the other way around — the circle packing proof will not prove that you can do it with rational coordinates. So, having several proofs leads you to several ways to understand the situation beyond the original basic theorem.

One example is László Lovász’s proof of the Kneser conjecture, which I think we put in the fourth edition. The Kneser conjecture was about a certain type of graph you can construct from the k-element subsets of an n-element set — you construct this graph where the k-element subsets are the vertices, and two k-element sets are connected by an edge if they don’t have any elements in common. And Kneser had asked, in 1955 or ’56, how many colors are required to color all the vertices if vertices that are connected must be different colors.

It’s rather easy to show that you can color this graph with n – 2k + 2 colors, but the problem was to show that fewer colors won’t do it. And so, it’s a graph coloring problem, but Lovász, in 1978, gave a proof that was a technical tour de force, one that used a topological theorem, the Borsuk-Ulam theorem. And it was an amazing surprise — why should this topological tool prove a graph-theoretic thing?
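
As a small illustration of the easy direction, here is a sketch (with assumed parameters n = 7, k = 3, not taken from the interview) that builds the Kneser graph’s colouring by smallest element, capped at n – 2k + 2, and checks that no two disjoint k-sets ever receive the same colour:

    # Sketch: the Kneser graph on k-element subsets of {1..n}, with the easy
    # (n - 2k + 2)-colouring: colour a set by its minimum element, capped at
    # n - 2k + 2. Lovász's theorem says no colouring with fewer colours exists.
    from itertools import combinations

    def kneser_colouring_is_proper(n, k):
        top = n - 2 * k + 2
        colour = {S: min(min(S), top) for S in combinations(range(1, n + 1), k)}
        # Edges join disjoint k-sets; a proper colouring never repeats a
        # colour across an edge.
        return all(colour[S] != colour[T]
                   for S, T in combinations(colour, 2)
                   if not set(S) & set(T))

    print(kneser_colouring_is_proper(7, 3))   # True: 3 colours suffice for n=7, k=3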

This turned into a whole industry of using topological tools to prove discrete mathematics theorems. And now it seems inevitable that you use these, and very natural and straightforward. It’s become routine, in a certain sense. But then, I think, it’s still valuable not to forget the original surprise.

To do these short and surprising proofs, you need a lot of confidence. And one way to get the confidence is if you know the thing is true. If you know that something is true because so-and-so proved it, then you might also dare to say, “What would be the really nice and short and elegant way to establish this?”

No Free Lunch theorem

Multiple sources:

http://ieeexplore.ieee.org/document/585893/ 

A framework is developed to explore the connection between effective optimization algorithms and the problems they are solving. A number of “no free lunch” (NFL) theorems are presented which establish that for any algorithm, any elevated performance over one class of problems is offset by performance over another class. 

http://www.statsblogs.com/2014/01/25/machine-learning-lesson-of-the-day-the-no-free-lunch-theorem/

A model is a simplified representation of reality, and the simplifications are made to discard unnecessary detail and allow us to focus on the aspect of reality that we want to understand. These simplifications are grounded on assumptions; these assumptions may hold in some situations, but may not hold in other situations. This implies that a model that explains a certain situation well may fail in another situation. In both statistics and machine learning, we need to check our assumptions before relying on a model.

The “No Free Lunch” theorem states that there is no one model that works best for every problem.

The assumptions of a great model for one problem may not hold for another problem, so it is common in machine learning to try multiple models and find one that works best for a particular problem.

This is especially true in supervised learning; validation or cross-validation is commonly used to assess the predictive accuracies of multiple models of varying complexity to find the best model. A model that works well could also be trained by multiple algorithms – for example, linear regression could be trained by the normal equations or by gradient descent.

Depending on the problem, it is important to assess the trade-offs between speed, accuracy, and complexity of different models and algorithms and find a model that works best for that particular problem.
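
As a concrete illustration of that workflow, here is a minimal sketch (using scikit-learn on synthetic data; the dataset and models are assumptions made for illustration, not taken from the sources above) that cross-validates two models of different character and reports which generalizes better on this particular problem:

    # Minimal sketch: compare two models by 5-fold cross-validation on one
    # dataset. The "winner" here says nothing about other problems (no free lunch).
    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeRegressor

    X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

    models = {
        "linear regression": LinearRegression(),
        "decision tree": DecisionTreeRegressor(max_depth=4, random_state=0),
    }

    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5, scoring="r2")
        print(f"{name}: mean R^2 = {scores.mean():.3f}")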

https://link.springer.com/article/10.1023/A:1021251113462

The no-free-lunch theorem of optimization (NFLT) is an impossibility theorem telling us that a general-purpose, universal optimization strategy is impossible. The only way one strategy can outperform another is if it is specialized to the structure of the specific problem under consideration. Since optimization is a central human activity, an appreciation of the NFLT and its consequences is essential. In this paper, we present a framework for conceptualizing optimization that leads to a simple but rigorous explanation of the NFLT and its implications.
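
The flavour of the result can be seen in a toy computation. The setup below is an assumption made purely for illustration (it is not Wolpert and Macready’s formal framework): averaged over every possible objective function on a tiny search space, two quite different non-repeating search strategies end up doing equally well.

    # Toy NFL illustration: over ALL functions f: {0..4} -> {0,1,2}, any two
    # deterministic, non-repeating search strategies find, on average, equally
    # good values after the same number of evaluations.
    from itertools import product

    POINTS = range(5)
    VALUES = (0, 1, 2)

    def scan(f, budget):
        """Evaluate points left to right."""
        return max(f[x] for x in list(POINTS)[:budget])

    def adaptive(f, budget):
        """Start in the middle, then move right while the values look good."""
        order, left, right = [2], 1, 3
        while len(order) < budget:
            if f[order[-1]] >= 1 and right <= 4:
                order.append(right); right += 1
            elif left >= 0:
                order.append(left); left -= 1
            else:
                order.append(right); right += 1
        return max(f[x] for x in order)

    budget = 3
    all_fs = list(product(VALUES, repeat=5))   # all 3^5 = 243 objective functions
    for strategy in (scan, adaptive):
        avg = sum(strategy(dict(enumerate(f)), budget) for f in all_fs) / len(all_fs)
        print(strategy.__name__, avg)          # both averages come out identical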


Network Effects – Metcalfe & Reed

Source: A16Z Slideshare, Mar 2016

 

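The captured slide itself is not reproduced here. As a reference point for the two laws the title refers to (the standard formulas, not taken from the deck), Metcalfe’s law values a network of n users by its roughly n^2 pairwise connections, while Reed’s law counts the roughly 2^n possible subgroups, which grows far faster:

    # Sketch of the two growth laws named in the title (illustrative only).
    def metcalfe(n):
        return n * (n - 1) // 2      # number of distinct user pairs

    def reed(n):
        return 2 ** n - n - 1        # number of subgroups of two or more users

    for n in (10, 20, 40, 80):
        print(f"n={n:>3}  Metcalfe={metcalfe(n):>6}  Reed={reed(n):.3e}")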

Using Computers to Explore Mathematics

Source: Quanta, Jan 2018

Schwartz uses computer experiments in much of his work — he’s in the vanguard in that respect. As he explains, computers complement human mathematical thought in several ways: they draw out patterns and provide hints that can lead to proofs which might not have been apparent to the mind alone.

I like doing computer experiments, and so I feel that sometimes I have a chance of making progress. The modern computer is a new tool, and I think of these simple things as excuses for data gathering. Like, I’m just going to program the computer and run some experiments and see if I can turn up some hidden patterns that nobody else had seen just because they hadn’t yet done these experiments.

They’re unbelievably good scratch paper. Mathematicians, even the old greats like Gauss and Euler, were trying to gather experimental evidence. They would work out special cases on paper by hand to get some idea of what might be happening. In a sense, the computer lets you do a lot more of that. It lets you gather much more experimental evidence about what might be true.

It’s also a visualization tool. It reveals things you’d have no idea would be true without it. A good example where you really need the computer is something like the Mandelbrot set. If you don’t have a computer, you could draw a couple points by hand. But then when people started doing these computer experiments, it just revealed this wealth of information about what’s going on. The Mandelbrot set, Julia sets, and all this stuff which would have been impossible to see without a lot of computation and plotting.
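
For a sense of what such an experiment looks like in practice, here is a minimal sketch (not Schwartz’s own code) that iterates z -> z^2 + c over a coarse grid of points c and prints a crude text picture of the Mandelbrot set:

    # Crude Mandelbrot experiment: a point c is kept if the orbit of
    # z -> z^2 + c stays bounded for max_iter steps.
    def escape_time(c, max_iter=50):
        z = 0
        for n in range(max_iter):
            z = z * z + c
            if abs(z) > 2:        # once |z| > 2 the orbit provably escapes
                return n
        return max_iter

    for row in range(25):
        line = ""
        for col in range(60):
            c = complex(-2 + 3 * col / 59, -1.2 + 2.4 * row / 24)
            line += "#" if escape_time(c) == 50 else " "
        print(line)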

Mathematics is extremely good for highly symmetric objects. In a sense, mathematics is about miracles. A recent big example of this is the solution to the Kepler conjecture in eight dimensions by Maryna Viazovska. [The conjecture deals with the densest possible way to pack a collection of spheres together.] It’s curious that the Kepler conjecture in eight dimensions is much easier to solve than the Kepler conjecture in three dimensions. The reason is because there’s this miraculous sphere packing in eight dimensions which is extremely, extremely symmetrical. These special configurations in eight dimensions are like these miraculous things. Whereas without the presence of extraordinary symmetries, in a way, math doesn’t know what to do. So then the computer is very useful, because it lets you search through the possibilities.

I played around with different ways of representing the data. Eventually I got the idea to draw a kind of higher dimensional representation, and suddenly this beautiful pattern emerged. I never would have guessed it.

The computer can gather a large amount of information, it can organize things graphically, it can serve as your external memory, so it can help you recognize these underlying patterns that might be too remote to see unaided. The computer is a shotgun approach, you just try a bunch of stuff. In a way, you’re not using your brain at all, at least initially. But then you get some feedback — there’s something that you see — then you tune the experiment to the feedback you see. And if it’s successful, you really end up knowing something you never would have found on your own.

Game of Life (cellular automata) is Turing-Complete

Source: Stanford University, date indeterminate

In the October 1970 issue of Scientific American, Martin Gardner popularized Conway’s Game of Life by describing the game’s capacity to “open up a whole new field of mathematical research, the field of cellular automata.” Many soon discovered the great power of cellular automata as models of computation. It was determined that cellular automata, as they appear in the Game of Life, have the same computational capacity as Turing machines. This is captured by the Church-Turing thesis, which states:

“No method of computation carried out by a mechanical process can be more powerful than a Turing machine.”

Therefore, as the Game of Life is Turing complete, it is one of the most powerful models of computation. In other words, no mechanical form of computation can solve a problem that a Turing machine or a cellular automaton cannot solve, given sufficient time and space. The following reasons led researchers to conclude that the Game of Life has all the computational capability of Turing machines, meaning it is Turing complete:

The cellular formation of a “glider” can interact with other patterns to construct logic gates such as AND, OR, and NOT. Therefore, it is possible to build a pattern of cells that acts like a finite state machine coupled to two counters. A formation that acts like such a machine has the same computational power as a universal Turing machine, also known as being Turing complete. This means that the Game of Life is as computationally powerful as any computer with unlimited memory and no time constraints.
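
For readers who want to experiment, here is a bare-bones sketch of the update rule itself on a finite grid, together with the glider mentioned above (the logic-gate and counter constructions are far larger patterns built from such pieces):

    # One Game of Life generation on a finite grid (cells outside the grid are
    # treated as dead), plus the classic glider.
    def step(alive, width, height):
        """alive: set of (x, y) live cells; returns the next generation."""
        next_gen = set()
        for x in range(width):
            for y in range(height):
                n = sum((nx, ny) in alive
                        for nx in (x - 1, x, x + 1)
                        for ny in (y - 1, y, y + 1)
                        if (nx, ny) != (x, y))
                # Conway's rules: birth on 3 neighbours, survival on 2 or 3
                if n == 3 or (n == 2 and (x, y) in alive):
                    next_gen.add((x, y))
        return next_gen

    glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
    gen = glider
    for _ in range(4):               # after 4 steps the glider has moved by (+1, +1)
        gen = step(gen, 20, 20)
    print(sorted(gen))               # the same shape, shifted one cell diagonally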

Practical Applications

Cellular automata yield a surprisingly diverse range of practical applications. We focus here on one particular example of a real-life application of cellular automata – the design of a public-key cryptosystem.

A public-key cryptosystem is a cryptographic protocol that relies on two keys – an enciphering key E, which is made public, and a deciphering key D, which is kept private. Such a system should have two important properties:

1) Security. For attackers who have no access to the private key, the protocol should be designed such that it would take a prohibitively large amount of time to recover the plain text from the cipher text.

2) Signature. The receiver should be able to tell who the sender is. In other words, if person A, say Alice, wants to send a message, she should be able to sign her message in a way that nobody but she could.

Public-Key Cryptography

In the system, each piece of plain text consists of binary digit blocks of a certain length, say N. In turn, each set of, say, k bits in the block can be viewed as an element of some ground set S. For instance, we could have a cryptosystem where each block contains 5 bits and each bit takes a value in the field of 2 elements.

Suppose now each block contains N bits and represents m elements of S. To satisfy the security requirement, we require an invertible function mapping S^m to S^m that satisfies the following conditions:

a) Easy to compute (for enciphering)

b) Hard to obtain its inverse without certain key pieces of information which only the receiver has access to (deterring would-be attackers)

The complexity of cellular automata lends itself nicely to applications in cryptography. In particular, cellular automaton rules that are invertible are prime candidates for the invertible function we require to construct a public-key cryptosystem. To succinctly represent the rules whilst preserving a large neighborhood-size, we can associate the ground set S with a mathematical structure such as a field or a ring. This way, addition and multiplication are well defined on S, and we can thus represent cellular automaton rules as polynomial functions.

Here is a sketch of how a cellular automaton cryptosystem might work. Let the ground set be a field. The enciphering key, E, is a composition of inhomogeneous and time-varying linear invertible rules which is made public (the observant reader might note that the fact that the rules are inhomogeneous and time-varying distinguishes the resulting cellular automaton from elementary cellular automata). The deciphering key, D, is the set of individual rules in the composite enciphering function.

The crucial factor that enables such a cellular automaton cryptosystem to work is the fact that it is extremely computationally expensive to unravel the original state from the cipher state without prior knowledge of the individual rules used in the composite enciphering function. This guarantees the security of the system.
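
As a toy illustration of the two ideas at work here (composing invertible, time-varying rules, and undoing them only when each individual rule is known), the sketch below uses a second-order reversible cellular automaton; this is a simplified stand-in chosen because any local rule becomes invertible in that construction, and it is not the actual scheme described above:

    # Toy sketch: each update is  next = local_rule(present) XOR past,
    # which is invertible for ANY local rule, since  past = local_rule(present) XOR next.
    # The secret "deciphering key" is the sequence of local rules applied.
    import secrets

    N = 64  # toy block length in bits

    def local_rule(state, a, b, c):
        """One inhomogeneous, parameter-dependent local rule on a cyclic row of bits."""
        n = len(state)
        return [state[(i + a) % n] ^ (state[(i + b) % n] & state[(i + c) % n])
                for i in range(n)]

    def encrypt(plain_bits, rules):
        past, present = [0] * N, list(plain_bits)     # fixed public initial "past" row
        for (a, b, c) in rules:                       # time-varying composition of rules
            past, present = present, [f ^ p for f, p in
                                      zip(local_rule(present, a, b, c), past)]
        return past, present                          # cipher text = the final two rows

    def decrypt(cipher, rules):
        past, present = cipher
        for (a, b, c) in reversed(rules):             # undo each step with its inverse
            present, past = past, [f ^ nxt for f, nxt in
                                   zip(local_rule(past, a, b, c), present)]
        return present

    rules = [(1, 2, 5), (3, 1, 7), (2, 6, 4)]         # the secret rule sequence
    msg = [secrets.randbelow(2) for _ in range(N)]
    assert decrypt(encrypt(msg, rules), rules) == msg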

Signing a message in our cryptosystem works in exactly the same way as signing a message in the RSA cryptosystem. All the sender has to do is to apply the secret decryption key D to her name and then include that coded signature at the end of her message. This way, when the receiver receives the message, he can simply use the public key E to decipher the coded signature and see if the name matches the presumed sender.
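
For comparison, here is the RSA-style signing step itself in miniature, with classic textbook toy parameters (far too small to be secure):

    # Toy RSA signature: sign with the private exponent d, verify with the
    # public exponent e.
    p, q = 61, 53
    n = p * q                              # 3233, the public modulus
    e = 17                                 # public (enciphering) exponent
    d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (2753), Python 3.8+

    message = 1234                         # "Alice", encoded as a number below n
    signature = pow(message, d, n)         # Alice signs with her private key D
    assert pow(signature, e, n) == message # anyone can verify with the public key E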

200 Terabyte Math Proof

Source: Nature.com, May 2016

Three computer scientists have announced the largest-ever mathematics proof: a file that comes in at a whopping 200 terabytes, roughly equivalent to all the digitized text held by the US Library of Congress. The researchers have created a 68-gigabyte compressed version of their solution — which would allow anyone with about 30,000 hours of spare processor time to download, reconstruct and verify it — but a human could never hope to read through it.

Computer-assisted proofs too large to be directly verifiable by humans have become commonplace, and mathematicians are familiar with computers that solve problems in combinatorics — the study of finite discrete structures — by checking through umpteen individual cases. Still, “200 terabytes is unbelievable”, says Ronald Graham, a mathematician at the University of California, San Diego. The previous record-holder is thought to be a 13-gigabyte proof, published in 2014.

The puzzle that required the 200-terabyte proof, called the Boolean Pythagorean triples problem, has eluded mathematicians for decades. … The problem asks whether it is possible to colour each positive integer either red or blue, so that no trio of integers a, b and c that satisfy Pythagoras’ famous equation a² + b² = c² are all the same colour. For example, for the Pythagorean triple 3, 4 and 5, if 3 and 5 were coloured blue, 4 would have to be red.
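
To make the combinatorial statement concrete, here is a brute-force sketch (workable only for tiny ranges; the actual proof used a SAT solver plus heavy symmetry breaking) that searches for a red/blue colouring with no monochromatic Pythagorean triple:

    # Brute-force search for a two-colouring of 1..n with no monochromatic
    # Pythagorean triple. Feasible only for small n; the search space explodes
    # long before the critical value 7,825.
    from itertools import product

    def pythagorean_triples(n):
        return [(a, b, c) for a in range(1, n + 1)
                for b in range(a, n + 1)
                for c in range(b, n + 1) if a * a + b * b == c * c]

    def has_valid_colouring(n):
        triples = pythagorean_triples(n)
        for colouring in product((0, 1), repeat=n):   # 2^n candidate colourings
            if all(len({colouring[a - 1], colouring[b - 1], colouring[c - 1]}) > 1
                   for a, b, c in triples):
                return True
        return False

    print(has_valid_colouring(20))   # True: valid colourings still exist at n = 20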

In a paper posted on the arXiv server on 3 May, Marijn Heule of the University of Texas at Austin, Oliver Kullmann of Swansea University, UK, and Victor Marek of the University of Kentucky in Lexington have now shown that there are many allowable ways to colour the integers up to 7,824 — but when you reach 7,825, it is impossible for every Pythagorean triple to be multicoloured. There are more than 10^2,300 ways to colour the integers up to 7,825, but the researchers took advantage of symmetries and several techniques from number theory to reduce the total number of possibilities that the computer had to check to just under 1 trillion. It took the team about 2 days running 800 processors in parallel on the University of Texas’s Stampede supercomputer to zip through all the possibilities. The researchers then verified the proof using another computer program.

Although the computer solution has cracked the Boolean Pythagorean triples problem, it hasn’t provided an underlying reason why the colouring is impossible, or explored whether the number 7,825 is meaningful, says Kullmann.

That echoes a common philosophical objection to the value of computer-assisted proofs: they may be correct, but are they really mathematics? If mathematicians’ work is understood to be a quest to increase human understanding of mathematics, rather than to accumulate an ever-larger collection of facts, a solution that rests on theory seems superior to a computer ticking off possibilities.

That did ultimately occur in the case of the 13-gigabyte proof from 2014, which solved a special case of a question called the Erdős discrepancy problem. A year later, mathematician Terence Tao of the University of California, Los Angeles, solved the general problem the old-fashioned way — a much more satisfying resolution.