
Google and FB Depend upon Advertising

Source: Visual Capitalist, May 2017

Related Resource: Visual Capitalist, Sep 2017


GE’s Digital Twins

Source: MIT Technology Review, Jul 2017

cloud-hosted software models of GE’s machines that can be used to save money and improve safety for its customers. GE builds these “digital twins” using information it gathers from sensors on the machines, supplemented with physics-based models, AI, data analytics, and knowledge from its scientists and engineers. Though digital twins are primarily lines of software code, the most elaborate versions look like 3-D computer-aided design drawings full of interactive charts, diagrams, and data points. They enable GE to track wear and tear on its aircraft engines, locomotives, gas turbines, and wind turbines using sensor data instead of assumptions or estimates, making it easier to predict when they will need maintenance. An aircraft engine flying over the U.S. could, for instance, have a digital twin on a GE computer server in California help determine the best service schedule for its parts.

The technology depends on artificial intelligence to continually update itself. What’s more, if data is corrupted or missing, the company fills in the gaps with the aid of machine learning, a type of AI that lets computers learn without being explicitly programmed, says Colin Parris, GE Global Research’s vice president for software research. Parris says GE pairs computer vision with deep learning, a type of AI particularly adept at recognizing patterns, and reinforcement learning, another recent advance in AI that enables machines to optimize operations, to enable cameras to find minute cracks on metal turbine blades even when they are dirty and dusty.

Collective Computation @ Santa Fe Institute

Source: Quanta Magazine, Jul 2017

Flack’s focus is on information: specifically, on how groups of different, error-prone actors variously succeed and fail at processing information together. “When I look at biological systems, what I see is that they are collective,” she said. “They are all made up of interacting components with only partly overlapping interests, who are noisy information processors dealing with noisy signals.”

How did you get into research on problem solving in nature, and how did you wind up at the Santa Fe Institute?

I’ve always been interested in how nature solves problems and where patterns come from, and why everything seems so organized despite so many potential conflicts of interest.

Collective computation is about how adaptive systems solve problems. All systems are about extracting energy and doing work, and physical systems in particular are about that. When you move to adaptive systems, you’ve got the additional influence of information processing, which we think allows a system to extract energy more efficiently even though it has to expend a little extra energy to do the information processing. Components of adaptive systems look out at the world, and they try to discover the regularities. It’s a noisy process.

Unlike in computer science where you have a program you have written, which has to produce a desired output, in adaptive systems this is a process that is being refined over evolutionary or learning time. The system produces an output, and it might be a good output for the environment or it might not. And then over time it hopefully gets better and better.

We have this principle of collective computation that seems to involve these two phases. The neurons go out and semi-independently collect information about the noisy input, and that’s like neural crowdsourcing. Then they come together and come to some consensus about what the decision should be. And this principle of information accumulation and consensus applies to some monkey societies also. The monkeys figure out sort of semi-independently who is capable of winning fights, and then they consolidate this information by exchanging special signals. The network of these signals then encodes how much consensus there is in the group about any one individual’s capacity to use force in fights.
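
The two phases described above (semi-independent, noisy information gathering followed by consensus) can be illustrated with a small simulation. This is only a sketch under strong simplifying assumptions: each component guesses a single yes/no fact with the same accuracy, and consensus is a plain majority vote, which is a stand-in chosen for this illustration rather than the signaling networks described in the interview. All function names are hypothetical.

```python
import random
from typing import Sequence


def noisy_reading(truth: bool, accuracy: float, rng: random.Random) -> bool:
    """Phase 1: one component's independent, error-prone guess about the truth."""
    return truth if rng.random() < accuracy else not truth


def consensus(readings: Sequence[bool]) -> bool:
    """Phase 2: pool the guesses by simple majority vote (assumed rule)."""
    return sum(readings) * 2 > len(readings)


def collective_accuracy(n_components: int, accuracy: float,
                        trials: int, seed: int = 0) -> float:
    """Fraction of trials in which the group's consensus matches the truth."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        truth = rng.random() < 0.5
        readings = [noisy_reading(truth, accuracy, rng)
                    for _ in range(n_components)]
        if consensus(readings) == truth:
            correct += 1
    return correct / trials


if __name__ == "__main__":
    # Compare a lone noisy component with a group of 51 equally noisy ones.
    print("single component:", collective_accuracy(1, 0.6, 2000))
    print("group of 51:     ", collective_accuracy(51, 0.6, 2000))
```

Under these assumptions, a group of individually unreliable components reaches the right answer far more often than any single member does, which is the intuition behind the "neural crowdsourcing" phrase.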

Now that you can follow up on these kinds of questions to your heart’s content, what would you say if you could visit yourself back at Cornell, in the stacks of the library?

Jorge Luis Borges is one of my favorite writers, and he wrote something along the lines of “the worst labyrinth is not that intricate form that can trap us forever, but a single and precise straight line.” My path is not a straight line. It has been a quite interesting, labyrinthine path, and I guess I would say not to be afraid of that. You don’t know what you’re going to need, what tools or concepts you’re going to need. The thing is to read broadly and always keep learning.

Can you talk a bit about what it’s like to start with a table of raw data and pull these sorts of grand patterns out of it? Is there a single eureka moment, or just a slow realization?

Typically what happens is, we have some ideas, and our group discusses them, and then over months or years in our group meetings we sort of hash out these issues. We are OK with slow, thoughtful science. We tend to work on problems that are a little bit on the edge of science, and what we are doing is formalizing them. A lot of the discussion is: “What is the core problem, how do we simplify, what are the right measurements, what are the right variables, what is the right way to represent this problem mathematically?” It’s always a combination of the data, these discussions, and the math on the board that leads us to a representation of the problem that gives us traction.

I believe that science sits at the intersection of these three things — the data, the discussions and the math. It is that triangulation — that’s what science is. And true understanding, if there is such a thing, comes only when we can do the translation between these three ways of representing the world.


P vs NP problem (Beyond Computation)

Google and FB Dominate the Digital Advertising Market

Source: Business Insider, Apr 2017

Wieser said that the two ad giants captured a combined 77% of gross spending in 2016, an increase from 72% in 2015. Facebook specifically accounted for 77% of the digital ad industry’s overall growth, he noted.

The overall US internet ad industry grew 21.8% from $59.6 billion to $72.5 billion in 2016, according to the IAB.

Related Resource: IAB, Apr 2017

Mobile advertising accounted for more than half (51%) of the record-breaking $72.5 billion spent by advertisers last year, according to the latest IAB Internet Advertising Revenue Report, released today by the Interactive Advertising Bureau (IAB), and prepared by PwC US. The total represents a 22 percent increase, up from $59.6 billion in 2015. Mobile experienced a 77 percent upswing from $20.7 billion the previous year, hitting $36.6 billion in 2016.
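
As a quick sanity check on the arithmetic in these excerpts, the sketch below recomputes the growth rates and the mobile share from the rounded dollar figures quoted above. The helper name `pct_change` is made up for this illustration. Because the inputs are rounded to one decimal place, the results can differ slightly from the report's own percentages, which were presumably computed from unrounded revenue.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Figures in billions of US dollars, as quoted in the excerpts above.
total_2015, total_2016 = 59.6, 72.5
mobile_2015, mobile_2016 = 20.7, 36.6

total_growth = pct_change(total_2015, total_2016)     # roughly 21.6
mobile_growth = pct_change(mobile_2015, mobile_2016)  # roughly 76.8
mobile_share = mobile_2016 / total_2016 * 100         # roughly 50.5

if __name__ == "__main__":
    print(f"total growth:  {total_growth:.1f}%")
    print(f"mobile growth: {mobile_growth:.1f}%")
    print(f"mobile share:  {mobile_share:.1f}%")
```

The rounded inputs reproduce the quoted 77 percent mobile upswing and a mobile share just over half, and they land within a few tenths of a point of the quoted overall growth.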

Solomonoff Induction

Source: Less Wrong, Jul 2012

Background


1. Algorithms — We’re looking for an algorithm to determine truth.

2. Induction — By “determine truth”, we mean induction.

3. Occam’s Razor — How we judge between many inductive hypotheses.

4. Probability — Probability is what we usually use in induction.

5. The Problem of Priors — Probabilities change with evidence, but where do they start?

The Solution

6. Binary Sequences — Everything can be encoded as binary.

7. All Algorithms — Hypotheses are algorithms. Turing machines describe these.

8. Solomonoff’s Lightsaber — Putting it all together.

9. Formalized Science — From intuition to precision.

10. Approximations — Ongoing work towards practicality.

11. Unresolved Details — Problems, philosophical and mathematical.
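
To make the outline a little more concrete, here is a toy sketch of the core idea: treat each hypothesis as a short description, weight it by an Occam-style prior of 2 to the minus its length, discard the hypotheses that contradict the observed bits, and let the survivors vote on the next bit. True Solomonoff induction ranges over all programs for a universal Turing machine and is uncomputable; this sketch deliberately swaps that for a tiny hypothetical hypothesis class ("repeat this binary pattern forever") with a crude length-based prior, so it only illustrates the flavor of the method. All names are invented for this example.

```python
from itertools import product
from typing import Iterator


def hypotheses(max_len: int) -> Iterator[str]:
    """Yield every binary pattern up to max_len bits.

    Each pattern stands for the hypothesis 'this pattern repeats forever'.
    """
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def predicts(pattern: str, observed: str) -> bool:
    """True if repeating `pattern` reproduces every observed bit."""
    return all(observed[i] == pattern[i % len(pattern)]
               for i in range(len(observed)))


def next_bit_probability(observed: str, max_len: int = 8) -> float:
    """Posterior probability that the next bit is '1' under the toy prior."""
    weight_one = 0.0
    total = 0.0
    for pattern in hypotheses(max_len):
        if not predicts(pattern, observed):
            continue  # hypothesis is falsified by the data
        weight = 2.0 ** (-len(pattern))  # shorter description, larger prior
        total += weight
        if pattern[len(observed) % len(pattern)] == "1":
            weight_one += weight
    return weight_one / total if total else 0.5


if __name__ == "__main__":
    print(next_bit_probability("010101"))  # alternating so far
    print(next_bit_probability("1111"))    # all ones so far
```

After seeing `010101`, the short pattern `01` dominates the surviving hypotheses, so the estimated probability of a `1` next is well below one half; after `1111` it is well above one half.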

Big Data Assumes Past Patterns Apply to the Future

Source: Fast Company, Jan 2017

“What big data is good for,” explains Cathy O’Neil, author of last year’s Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, “is finding patterns of behavior in the past. It will never help us find something that’s completely new.”