
WTF is machine learning?


Image Credits: Bryce Durbin

While the number of headlines about machine learning might lead one to think that we just discovered something profoundly new, the reality is that the technology is nearly as old as computing.

It’s no coincidence that Alan Turing, one of the most influential computer scientists of all time, started his 1950 treatise on computing with the question “Can machines think?” From our science fiction to our research labs, we have long questioned whether the creation of artificial versions of ourselves will somehow help us uncover the origin of our own consciousness, and more broadly, our role on earth. Unfortunately, the learning curve on AI is really damn steep. By tracing a bit of history, we should hopefully be able to get to the bottom of wtf machine learning really is.

If my big data is big enough, can I create intelligence?

Our first attempts at replicating ourselves involved jamming machines full of information and hoping for the best. Seriously, there was a time when the prevailing theory of consciousness was that it could arise from just a ton of information connected together. Google could be seen by some as the culmination of this vision, but while the company has indexed 30 trillion webpages, I don’t think anyone expects our search engines to start asking us if there is a god.

Rather, the beauty of machine learning is that instead of pretending computers are human and simply feeding them with knowledge, we help computers to reason and then let them generalize what they’ve learned to new information.

Though the terms are often poorly understood, neural networks, deep learning and reinforcement learning are all forms of machine learning. They’re all methods of creating generalized systems that can perform analysis on new data. Put a different way, machine learning is one of many artificial intelligence techniques, and things like neural networks and deep learning are just tools that can be used to build better frameworks with broader applications.

Back in the 1950s, our computing power was limited, we didn’t have access to big data, and our algorithms were rudimentary. This meant that our ability to advance machine learning research was quite limited. However, that didn’t stop people from trying.

Back in 1952, Arthur Samuel made a checkers program using a very basic form of AI called alpha-beta pruning. This is a method for reducing computational load when working with search trees that represent data, but it’s not always the best strategy for every problem. Even neural networks made an early appearance with Frank Rosenblatt’s perceptron.
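To make alpha-beta pruning a little more concrete, here is a minimal sketch in Python. It is not Samuel’s program; the game tree is a made-up nested list whose leaves are scores, and the `visited` list is a hypothetical addition that only exists to show which leaves the search actually looked at.

```python
import math

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Return the minimax value of a tree given as nested lists with numeric leaves."""
    if not isinstance(node, list):
        # Leaf: record that we evaluated it, then return its score.
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing player will never allow this branch
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizing player already has a better option elsewhere
    return best

# Made-up tree: we move first (maximizing), the opponent replies (minimizing).
tree = [[3, 5], [2, 9], [0, 1]]
visited = []
value = alphabeta(tree, visited=visited)
print(value, visited)
```

On this toy tree the search settles on the same answer plain minimax would, but it skips two of the six leaves because they could not change the outcome. That skipped work is the "reduced computational load."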

A complex-sounding model that you should read about anyway

The perceptron was way ahead of its time, leveraging neuroscience to advance machine learning. On paper, the idea looked something like the sketch to the right.

To understand what it’s doing, you first have to understand that most machine learning problems can be broken down into either classification or regression. Classifiers are used to categorize data, while regression models broadly deal with extrapolating out trends to make predictions.
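The difference is easier to see side by side. The sketch below is purely illustrative: the sales figures, the temperature threshold and the function names `fit_line` and `classify` are all invented for this example. The regression half fits a straight line with ordinary least squares and extends it one step; the classification half returns a category instead of a number.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def classify(temperature_c, threshold=30.0):
    """Toy classifier: sort a reading into one of two named categories."""
    return "hot" if temperature_c > threshold else "not hot"

# Regression: extend a trend to a point we have not observed (made-up data).
days = [1, 2, 3, 4]
sales = [2.0, 4.0, 6.0, 8.0]
slope, intercept = fit_line(days, sales)
forecast_day_5 = slope * 5 + intercept

# Classification: the answer is a category, not a number.
label = classify(35.0)
print(forecast_day_5, label)
```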

The perceptron is an example of a classifier: it takes a set of data and splits it into two sets. In this case, the existence of two traits with respective weights is enough for this object to be classified in the “green” category. Classifiers today separate the spam from your inbox and detect fraud for your bank.

Rosenblatt’s model takes a series of inputs (think features like length, weight and color) and assigns each of them a weight. The model then repeatedly adjusts the weights until its output falls within an accepted margin of error.

For example, one could input that the weight of an object that happens to be an apple is 100 grams. The computer doesn’t know it’s an apple, but the perceptron can classify the object as an apple-like-object or a non-apple-like-object by adjusting the classifier’s weights with respect to a known training set of data. Once the classifier has been tuned, it can ideally be reused on a data set it has never been exposed to before to classify unknown objects.
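Here is roughly what that tuning loop can look like in code. Treat it as a sketch of the classic perceptron learning rule, not Rosenblatt’s original implementation. The two features (weight in hundreds of grams and a 0-to-1 "roundness" score), the six training examples and the function names are all assumptions made up for illustration.

```python
def predict(weights, bias, features):
    """Return 1 (apple-like) if the weighted sum clears zero, else 0."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if total > 0 else 0

def train_perceptron(samples, labels, learning_rate=1.0, max_epochs=100):
    """Nudge the weights after every mistake until a full pass has none."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in zip(samples, labels):
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:
            break  # every training example is classified correctly
    return weights, bias

# Hypothetical training set: (weight in hundreds of grams, roundness 0..1).
samples = [(1.0, 0.9), (0.2, 0.1), (1.2, 0.8), (0.3, 0.4), (0.9, 1.0), (0.1, 0.3)]
labels = [1, 0, 1, 0, 1, 0]  # 1 = apple-like, 0 = not apple-like

weights, bias = train_perceptron(samples, labels)

# Objects the classifier has never seen before.
unseen_heavy_round = predict(weights, bias, (1.1, 0.85))
unseen_light_flat = predict(weights, bias, (0.25, 0.2))
print(unseen_heavy_round, unseen_light_flat)
```

The loop only stops early because this toy data can be cleanly split by a straight line. On messier data a single perceptron may never reach zero mistakes, which is why the sketch also caps the number of passes.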

It’s ok, even AI researchers are confused by this stuff

The perceptron is just one example of many early advances made in machine learning. Neural networks are sort of like big collections of perceptrons working together, a lot like how our brains and neurons work, which is where the name comes from.

Skipping forward a few decades, advancements in AI have continued to be about replicating the way the mind works rather than simply replicating what we perceive its contents to be. Basic, or “shallow”, neural networks are still in use today, but deep learning has caught on as the next big thing. Deep learning models are neural networks with more layers. A totally reasonable reaction to this incredibly unsatisfying explanation is to ask what I mean by layers.

To understand this, we have to remember that just because we say a computer can organize cats and humans into two different groups, the computer itself doesn’t process the task the same way a human would. Machine learning frameworks take advantage of the idea of abstraction to accomplish tasks.

To a human, faces have eyes. To a computer, faces have pixels that are light and dark that make up some abstraction of lines. Each layer of a deep learning model lets the computer identify another level of abstraction of the same object. Pixels to lines to 2D to 3D geometry.
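A toy version of stacking layers, with heavy caveats: the weights below are picked by hand rather than learned, the "pixels" are just two on/off inputs, and real deep learning models use smoother activation functions and vastly more units. The point is only that the first layer detects simple features and the second layer combines them into something a single perceptron cannot express, here "exactly one of the two pixels is lit."

```python
def step(value):
    """Threshold activation: fire (1) only if the input is positive."""
    return 1 if value > 0 else 0

def layer(inputs, weights, biases):
    """One layer: each row of weights is a perceptron-style unit."""
    return [step(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Layer 1 (hand-picked): unit 0 fires if either pixel is lit,
# unit 1 fires only if both pixels are lit.
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_BIASES = [-0.5, -1.5]

# Layer 2 (hand-picked): fires if "either" is true but "both" is not.
OUTPUT_WEIGHTS = [[1.0, -1.0]]
OUTPUT_BIASES = [-0.5]

def exactly_one_lit(pixel_a, pixel_b):
    hidden = layer([pixel_a, pixel_b], HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]

results = {(a, b): exactly_one_lit(a, b) for a in (0, 1) for b in (0, 1)}
print(results)
```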

Despite overwhelming stupidity, computers already passed the Turing test

This fundamental difference in the way humans and computers evaluate the world presents a serious challenge to creating true artificial intelligence. The Turing test was conceptualized to evaluate our progress in AI, but it largely ignores this reality. Turing’s test is a behaviorist test focused on evaluating the ability of computers to emulate human output.

However, mimicry and probabilistic reasoning are, at best, only part of the mystery of intelligence and consciousness. Some believe we successfully passed the Turing test in 2014, when a machine convinced 10 out of 30 scientists that it was human during a five-minute keyboard conversation (and yet Siri still tries to search Google for every third thing we ask her).

So should I get my jacket for the AI winter?

Despite progress, scientists and entrepreneurs alike have been quick to over-promise the capabilities of AI. The resulting boom and bust cycles are commonly referred to as AI winters.

We have been able to do some unbelievable things with machine learning, like classify objects in video footage for autonomous cars and predict crop yields with satellite imagery. Long short-term memory is helping our machines deal with time series for things like sentiment analysis in videos. Reinforcement learning takes ideas from game theory and includes a mechanism to assist learning through rewards. Reinforcement learning was a key part of how AlphaGo was able to upset Lee Sedol.

That said, despite all the progress, the great secret of machine learning is that while we usually know the inputs and outputs of a given problem, and we explicitly write the code that acts as the intermediary, we can’t always identify how the model is going from input to output. Researchers refer to this challenge as the black box problem of machine learning.

Before getting too discouraged, we must remember that the human brain itself is a black box. We don’t really know how it works and cannot examine it at all levels of abstraction. I would be labeled crazy if I asked you to dissect a brain and point to the memories held within it. However, not being able to understand something isn’t game over, it’s game on.

This post introduced many of the basic concepts underpinning machine learning but leaves plenty on the table for future WTF is pieces. Deep learning, reinforcement learning and neural nets could all stand on their own, but hopefully after reading this post you can visualize the field itself and draw connections to many of the companies we cover daily on TechCrunch.

More posts from the WTF is series

WTF is a container?

WTF is clickbait?

WTF is a mirrorless camera? 

 
