Probability

coin toss

I took one course in statistics. I didn't enjoy it; though the ideas in it could have been interesting, the presentation of them was not.

I came across a video by Cassie Kozyrkov that asks "What if I told you I can show you the difference between Bayesian and Frequentist statistics with one single coin toss?" Cassie is a data scientist and statistician. She founded the field of Decision Intelligence at Google, where she serves as Chief Decision Scientist. She has another one of those jobs that didn't exist in my time of making career decisions.

Most of us probably had some math teacher use a coin toss to illustrate simple probability. I'm going to toss this quarter. What are the odds that it lands heads-up? 50/50. The simple lesson is that even if it has come up tails 6 times in a row, the odds for toss 7 are still 50/50.

But after she tosses it and covers it, she asks: what is the probability that the coin in my palm is heads-up now? She says that the answer you give in that moment is a strong hint about whether you’re inclined toward Bayesian or Frequentist thinking.

The Frequentist: “There’s no probability about it. I may not know the answer, but that doesn’t change the fact that if the coin is heads-up, the probability is 100%, and if the coin is tails-up, the probability is 0%.”

The Bayesian: “For me, the probability is 50% and for you, it’s whatever it is for you.”

Cassie's video about this goes much deeper - too deep for my current interests. However, I am intrigued by the idea that if the parameter is not a random variable (Frequentist), you can consider your ability to get the right answer, but if you let the parameter be a random variable (Bayesian), there's no longer any notion of right and wrong. She says, "If there’s no such thing as a fixed right answer, there’s no such thing as getting it wrong."

I'll let that hang in the air here for you to consider.
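If you want to see that moment in code, here is a minimal sketch in Python (the numbers are mine, purely for illustration). Every covered coin is already fixed heads-up or tails-up, just as the Frequentist says, yet a guesser who assigns 50% to "heads" each time ends up right about half the time, which is exactly the Bayesian's credence.

```python
import random

# Simulate the toss-and-cover scenario many times. Each covered coin is
# already fixed heads-up or tails-up (the Frequentist's point), but a
# guesser who says "50% heads" every time is well calibrated (the Bayesian's).
trials = 100_000
heads = sum(random.random() < 0.5 for _ in range(trials))
print(f"fraction of covered coins that were heads-up: {heads / trials:.3f}")
```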



If you do have an interest in going deeper, try:
Frequentist vs Bayesian fight - your questions answered
An 8 minute statistics intro
Statistical Thinking playlist
Controversy about p-values (p as in probability)

 

Law of Large Numbers

roulette
Image by Thomas Wolter from Pixabay

A recent episode of the PBS program NOVA took me back to my undergraduate statistics course. It was a course I didn't want to take because I have never been a math person and I assumed that was what the course was about. I was wrong.

The episode is on probability and prediction, and its approach reminded me of the course, which also turned out to be surprisingly interesting. Both the program and the course were intended for non-math majors, and the producers and professor focused on everyday examples.

I suggest you watch the NOVA episode. You will learn about things that are currently in the news and that you may not have associated with statistics, such as the wisdom of crowds, herd immunity, herd thinking and mob thinking.

For example, the wisdom of crowds is why, when a contestant on a Who Wants to Be a Millionaire type of program asks the audience and 85% of a few hundred people answer "B," there's an excellent chance that "B" is the correct answer. And larger samples get more accurate. Why is that?
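Here's a quick, hypothetical simulation of that audience effect (the 60% figure below is my assumption, not anything from the show): if each audience member independently has a modest chance of knowing the right answer, the majority vote becomes nearly certain to be correct as the crowd grows.

```python
import random

def majority_correct(crowd_size: int, p_correct: float = 0.6,
                     trials: int = 2_000) -> float:
    """Fraction of trials where a majority of the crowd picks the right answer."""
    wins = 0
    for _ in range(trials):
        votes = sum(random.random() < p_correct for _ in range(crowd_size))
        if votes > crowd_size / 2:
            wins += 1
    return wins / trials

for size in (1, 11, 101, 501):
    print(f"crowd of {size:>3}: majority right {majority_correct(size):.0%} of the time")
```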

One of the things I still recall from that class that the program highlighted was the law of large numbers. The law of large numbers states that as a sample size grows, its mean gets closer to the average of the whole population. It was proposed by the 16th-century mathematician Gerolamo Cardano but was proven by the Swiss mathematician Jakob Bernoulli in 1713.
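You can watch the law at work with a few lines of Python, a sketch of mine rather than anything from the course or the program: the running average of fair coin flips wanders early on and then settles toward 0.5.

```python
import random

# Law of large numbers: the running mean of fair coin flips (heads = 1,
# tails = 0) drifts toward the true probability of 0.5 as flips pile up.
flips = [random.randint(0, 1) for _ in range(100_000)]
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>6} flips, fraction heads = {sum(flips[:n]) / n:.4f}")
```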

It works for many situations, from the stock market to a roulette wheel. I recall that we learned about the "Gambler’s Fallacy." The fallacy snares gamblers who don't know enough math or statistics. They stand by the wheel and see that red has won once and black has now won 5 times in a row. Red is due to win, right? Wrong. Red and black work just like a coin flip: each spin is independent, so the odds never change based on what came before. The casino knows that. It even lists which colors and numbers have come up on a screen to encourage you to believe the fallacy.
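You can test the fallacy for yourself. In this sketch (ignoring the green zeros, so each spin is red or black at 50/50), red after a run of five blacks comes up about as often as it does on any other spin.

```python
import random

# Gambler's fallacy check: after black has hit 5 times in a row,
# is red any more likely on the next spin?
spins = [random.choice("RB") for _ in range(1_000_000)]
after_streak = [spins[i + 5] for i in range(len(spins) - 5)
                if spins[i:i + 5] == list("BBBBB")]
print(f"after 5 blacks in a row, red came up "
      f"{after_streak.count('R') / len(after_streak):.3f} of the time")
```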

Flip the coin or spin the wheel 10 times and it could be heads or red 9 times. Flip or spin 500 times and it will come out a lot closer to 50/50.

The "house edge" for American Roulette exists because there is that double zero on the wheel. That gives the house an edge of 2.70%. The edge for European roulette is 5.26%. 

Knowing about probability greatly increases your accuracy in making predictions. And more data makes that accuracy possible.

 

Strong and Weak AI

programming
Image by Gerd Altmann from Pixabay

Ask several people to define artificial intelligence (AI) and you'll get several different definitions. If some of them are tech people and the others are just regular folks, the definitions will vary even more. Some might say that it means human-like robots. You might get the answer that it is the digital assistant on their countertop or inside their mobile device.

One way of differentiating AI that I don't often hear is by the two categories of weak AI and strong AI.

Weak AI (also known as “Narrow AI”) simulates intelligence. These technologies use algorithms and programmed responses and generally are made for a specific task. When you ask a device to turn on a light or what time it is or to find a channel on your TV, you're using weak AI. The device or software isn't doing any kind of "thinking" though the response might seem to be smart (as in many tasks on a smartphone). You are much more likely to encounter weak AI in your daily life.

Strong AI is closer to mimicking the human brain. At this point, we could say that strong AI is “thinking” and "learning," but I would keep those terms in quotation marks. Definitions of strong AI might also include some discussion of technology that learns and grows over time, which brings us to machine learning (ML), a field I would consider a subset of AI.

ML algorithms are becoming more sophisticated, and it might excite or frighten you as a user that they are getting to the point where they are learning and executing based on the data around them. This is called "unsupervised ML." That means that the AI does not need to be explicitly programmed. In the sci-fi nightmare scenario, the AI no longer needs humans. Of course, that is not even close to true today, as the AI requires humans to set up the programming and to supply the hardware and its power. I don't fear an AI takeover in the near future.
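To make "unsupervised" a little more concrete, here is a minimal sketch using scikit-learn's k-means clustering; the data and parameters are made up for illustration. Nobody labels the points, yet the algorithm finds the two groups on its own.

```python
import numpy as np
from sklearn.cluster import KMeans

# Unsupervised learning: no labels are provided. K-means discovers the
# two clusters hiding in these 2-D points by itself.
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
                    rng.normal(loc=3.0, scale=0.5, size=(50, 2))])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print("cluster centers:\n", model.cluster_centers_)
print("first 10 cluster labels:", model.labels_[:10])
```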

But strong AI and ML can go through the huge amounts of data they are connected to and find useful patterns, including patterns and connections that a human would be unlikely to find. Recently, you may have heard of attempts to use AI to find a coronavirus vaccine. AI can do very tedious, data-heavy, and time-intensive tasks in a much faster timeframe.

If you consider what your new smarter car is doing when it analyzes the road ahead, the lane lines, objects, your speed, the distance to the car ahead and hundreds or thousands of other factors, you see AI at work. Some of that is simpler weak AI, but more and more it is becoming stronger. Consider all the work being done on autonomous vehicles over the past two decades, much of which has found its way into vehicles that still have drivers.

Of course, cybersecurity and privacy become key issues when data is shared. You may feel more comfortable allowing your thermostat to learn your habits, or your car to learn how and where you drive, than you are about letting the government know that same data. Discover the level of data we share online doing financial transactions, or even just visiting sites, making purchases, and searching, and you'll find your level of paranoia rising. I may not know who you are as you read this article, but I suspect someone else knows and is more interested in knowing than I am.