01. Introduction to AI & Machine Learning

AI and ML are already changing our world, but the next 20 years will be unlike anything mankind has imagined. While the effects will cross all boundaries, education may be a special case. Of all institutions, education has been minimally affected by technology to date. We can expect the effect to be particularly dramatic in education because schools and educators are unused to the kind of fundamental change that has transformed, for example, healthcare.

AI and ML currently rely on techniques from data science to collect, prepare, and analyze the information used to train and to query AI/ML systems. We used to call data science “statistics,” but common statistical processes were backward-looking and did not fit the AI/ML goal of forward-looking prediction. Similarly, detailed mathematical analysis of data and algorithms in search of proofs, evidence of causality, and accuracy has given way to the search for trends and correlations. One effect to note for schools is that formal calculus has become less important as modern statistics for data science has become more important. Upcoming topics will expand on these ideas, but know that data analysis skills are foundational to much of modern AI and ML.

Machine Learning is a subset of Artificial Intelligence, and in some ways, ML is redefining AI. Traditionally, an AI system was “programmed” using the relationships that experts had identified in a particular field. A spam-detecting AI would be programmed to scan messages and check whether they originate from a blacklisted address or contain forbidden keywords. Today’s machine learning approach feeds the ML system lots of examples that are “tagged” as SPAM or NOT-SPAM, and the system learns the criteria that previously had been hand coded. Whereas we “teach” a child, we “train” an ML system; in other words, we don’t actively teach it the rules. Instead, we give it an appropriate learning algorithm and feed it the data it needs to learn by itself. It’s an incredibly powerful approach, possibly the most powerful idea mankind has ever developed.
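The contrast between the two approaches can be sketched in a few lines of Python. This is only a toy illustration, not how a production spam filter is built: the addresses, keywords, and tagged messages are invented for the example, and the learning step uses a simple naive Bayes word count as a stand-in for whatever learning algorithm a real system would use.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data, invented for illustration only.
BLACKLIST = {"promo@example.com"}
FORBIDDEN_WORDS = {"winner", "prize"}


def rule_based_is_spam(sender, text):
    """Traditional approach: a person writes the criteria by hand."""
    words = set(text.lower().split())
    return sender in BLACKLIST or bool(words & FORBIDDEN_WORDS)


def train(examples):
    """ML approach: count the words seen in messages tagged with each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def predict(model, text):
    """Return the tag whose training messages best match the words (naive Bayes)."""
    word_counts, label_counts = model
    vocab = set().union(*word_counts.values())
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total)
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Tagged examples: nobody writes a rule, the counts are the "learned criteria".
TAGGED = [
    ("claim your free prize now", "SPAM"),
    ("free money click now", "SPAM"),
    ("meeting agenda for monday", "NOT-SPAM"),
    ("lunch on monday with the team", "NOT-SPAM"),
]
model = train(TAGGED)
```

Calling predict(model, "free prize money") or predict(model, "team meeting monday") applies the learned word counts to a new message. Changing the filter's behavior means changing the tagged examples, not the code, whereas the rule-based function changes only when someone edits BLACKLIST or FORBIDDEN_WORDS by hand.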

To understand these ideas enough to have informed opinions or to make decisions regarding AI and ML, you need to prepare yourself in three ways:
1. You need some conceptual understanding.
2. You need to understand how AI/ML apply to relevant real-world cases or examples.
3. You need to explore, hands-on, the kinds of tools used to implement AI and ML.
Each numbered topic in this AI/ML resource will deepen your experience in at least two of these ways (you are in “topic #1” now). Reach the next topics from the MentalEdge “AI/ML” drop-down menu.

Here’s a high-level overview by technology and AI legend Kevin Kelly discussing the three fundamental ways AI and Machine Learning will change human capability (14-min): How AI can bring on a second industrial revolution: https://www.ted.com/talks/kevin_kelly_how_ai_can_bring_on_a_second_industrial_revolution

Case Study: JD.com’s fully-automated warehouse in Shanghai, China. Such a warehouse now needs about 5 employees to do work that formerly took perhaps 100. Note the steps and variety of tasks done by robots, sensors, and automated machinery. An AI system monitors the whole operation to optimize productivity, receive materials, ship products, and schedule maintenance and repair (3-min): https://www.youtube.com/watch?v=RFV8IkY52iY

What AI/ML can do, and what IT CANNOT DO: AI and Machine Learning are arguably the most powerful tools ever created by mankind. They are the tools that come closest to being able to “think.” But stepping back, what does that really mean? AI and ML can give us more accurate and faster analysis and prediction. They can give us new categories of options. But they CANNOT judge which options humans should take. AI and ML can have morality built in, but it’s the morality of the designer/programmer. The hard choices of what to do about poverty, education, climate, equity, and all the other big problems must still be guided by humans outside the AI/ML environment. AI and ML can inform our choices and help us to achieve our goals, but WE still have to set those goals and values. Here is a 2019 TIME compilation of FOCUS and ACTION recommendations from a broad spectrum of leaders in their fields. Scroll through the lo-o-o-ng list, then pick two or three that interest you and read them: http://time.com/collection/davos-2019/5502586/big-ideas-2020s/

Hands-On: Introduction to Data Science with R

Here is the SIMPLEST (really!) hands-on introduction to R and data analysis: https://www.youtube.com/watch?v=rUJiolRoH1M&t=540s

——————- outdated but useful material below this line ———————-

Follow David Langer’s 2014 video, Part 1 (1.5 to 2 hrs): https://www.youtube.com/watch?v=32o0DnuRjfg
The video guides you through installing R and RStudio, finding and loading data, basic R commands, and customizing your R environment with specific additional packages. Get the updated data files and the R tutorial file from the GitHub repository: https://github.com/EasyD/IntroToDataScience

Errata in the video:
1. The current version of the Titanic data in Kaggle has an additional column labeled “PassengerId” that is incompatible with the tutorial. Also, all of the column labels use mixed case (with capital letters), whereas the tutorial files used only lowercase column labels. Just delete the PassengerId column and change the column labels to all lowercase in the .csv files. Here’s how:
  BEFORE loading “train” and “test” into R, open the two .csv files in a spreadsheet program (Excel, Numbers, etc). Delete the entire first column (PassengerId). Next, change all uppercase letters in the column labels to lowercase. Export the files to CSV, named “train.csv” and “test.csv” respectively. If you use Numbers on Mac, don’t save with Headers.
———
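2. Use geom_bar() instead of geom_histogram(). In current versions of ggplot2, geom_histogram() accepts only continuous variables, so the tutorial’s plots of categorical (factor) variables need geom_bar(). Use the same data or parameters inside the parentheses.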
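The spreadsheet cleanup in erratum 1 can also be scripted. Here is a minimal sketch in Python (standard library only) that drops the PassengerId column and lowercases the column labels; the function name clean_titanic_csv and the input file names in the usage note are hypothetical, and the script assumes the Kaggle files have a single header row.

```python
import csv


def clean_titanic_csv(src_path, dst_path, drop_column="PassengerId"):
    """Copy a CSV file, dropping one column and lowercasing the header row."""
    # "utf-8-sig" reads plain UTF-8 and also tolerates a leading byte-order mark.
    with open(src_path, newline="", encoding="utf-8-sig") as src:
        rows = list(csv.reader(src))

    header = rows[0]
    # Positions of the columns to keep (everything except drop_column).
    keep = [i for i, name in enumerate(header) if name != drop_column]

    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow([header[i].lower() for i in keep])
        for row in rows[1:]:
            writer.writerow([row[i] for i in keep])
```

For example, if the Kaggle downloads were saved as train_kaggle.csv and test_kaggle.csv (names assumed here), calling clean_titanic_csv("train_kaggle.csv", "train.csv") and clean_titanic_csv("test_kaggle.csv", "test.csv") would write the two files the tutorial expects. Writing to a new file rather than overwriting the download keeps the original data intact in case something goes wrong.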