AI and Data Science go hand in hand. One of the most common questions we get is: “What is AI?” Actually, AI is a broad term that covers a range of different fields, such as artificial neural networks, machine learning, and natural language processing.
The term Artificial Intelligence conjures the image of a bionic system or entity that can perform human functions and is human-like but non-biological.
Two perceptions of AI exist. The first is that robots can make our lives much easier by taking over our tasks and assignments and working tirelessly, freeing up our time and energy. The second, driven more by the media, is the rise of Artificial Intelligence over mankind: AI will decide that humans have done enough harm and wipe them off the earth.
There is a problem with both these perceptions or assumptions. AI is not a single homogeneous thing; it has degrees of sophistication as compared to human intelligence. There are three categories:
- Narrow AI – Does simple, specific tasks without human supervision, like image recognition or deducing a transcript from a sound file.
- General AI – Can reason about unfamiliar problems without prior task-specific knowledge, much as a human does intuitively.
- Super AI – Has superhuman intelligence, or intelligence that humans cannot acquire.
Anything that can augment human ability can be either beneficial or dangerous, depending on how it is used. AI can either fix or exacerbate human problems.
The perceptions of AI – good and bad – hinge on the assumption that AI surpasses us in ability and intelligence. But we do not yet know enough about the human brain to replicate its functions, let alone exceed them. We are still at the first level of AI: the most advanced AI technology in the world today remains within the Narrow AI category.
How are Machine Learning and Data Science related to AI?
Machine Learning (also called statistical learning) and data science are subsets of AI. ML is the component at the heart of data science: it is the set of techniques that makes processes like image and voice recognition possible. The practice or profession that works on improving Narrow AI is called data science.
Introduction to Data Science
Data science is an emerging field and one of the most coveted professions in the world today. Before we delve into other aspects, we must know that data science can essentially be broken up into stages, from ascertaining whether a problem can even be solved by Machine Learning and data science to arriving at a viable solution or answer.
There are five stages of the data science process:
- Goal – Determining or establishing the problem and objective
- Collection – Acquiring statistical data
- Processing – Identifying and processing data
- Machine Learning – Making inferences from the data collected
- Decision – Either passing on inferences in the form of recommendations or using the ML insights for the betterment of the company
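As a rough illustration, the five stages above can be laid out as five small Python functions. This is only a sketch: the sales figures, the target value, and every function name are made up for the example, and a plain average stands in for a real Machine Learning model.

```python
from statistics import mean

# Stage 1 - Goal: decide whether average daily sales reach a target.
# The target is a hypothetical value chosen for this example.
TARGET_DAILY_SALES = 100.0


def collect():
    """Stage 2 - Collection: stand-in for reading a database or file."""
    # Made-up readings; None marks a day with no recorded value.
    return [96.0, None, 104.0, 110.0, None, 98.0, 107.0]


def process(raw):
    """Stage 3 - Processing: drop the missing values."""
    return [value for value in raw if value is not None]


def infer(clean):
    """Stage 4 - Machine Learning: a plain mean stands in for a model."""
    return mean(clean)


def decide(estimate, target):
    """Stage 5 - Decision: turn the inference into a recommendation."""
    return "keep current plan" if estimate >= target else "investigate shortfall"


raw = collect()
clean = process(raw)
estimate = infer(clean)
recommendation = decide(estimate, TARGET_DAILY_SALES)
print(f"estimate={estimate:.1f}, recommendation={recommendation}")
```

In a real project each stage is far larger, but the hand-off from one stage to the next follows the same order.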
Data science has three main pillars or components:
- Statistics, Mathematics, and Probability Theory – Includes Machine Learning
- Computer Science or Programming – Includes Big Data (large volumes of digital data)
- Domain Knowledge – Understanding the industry well enough to apply statistics and computer programming to its problems
It is essential for a data scientist to have a background in statistics and math, know how to translate that knowledge into computer programming, and know a thing or two about the domain they are operating in, for example, the automotive or sales industries. A good data scientist must have technical skills and also the ability to employ those skills to solve problems in the industry or domain they are working in.
How does one become a data scientist?
A statistician who uses programming to perform statistical techniques is a data scientist. For a college graduate, it is essential to know enough programming to be able to write scripts for Machine Learning. For anyone wanting to pursue a career in data science, learning Python is an absolute must. It is a general-purpose programming language and one of the most widely used programming languages in the world.
Apart from this, there are certain programming constructs that can be learned within the span of two weeks.
There are six programming constructs:
- Data types
- Data structures
Only after mastering these six constructs, along with statistical knowledge and probability theory, can one become a data scientist.
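For a feel of the two constructs named in the list, data types and data structures, here is a short Python sketch. The variable names and values are invented for illustration, and it deliberately covers only those two constructs.

```python
# Data types: the kind of value a variable holds.
age = 42            # int
height_m = 1.78     # float
name = "Ada"        # str
is_member = True    # bool

# Data structures: ways of grouping values together.
scores = [88, 92, 79]                 # list: ordered and mutable
point = (3.0, 4.0)                    # tuple: ordered and immutable
tags = {"ml", "stats", "ml"}          # set: duplicates are dropped
person = {"name": name, "age": age}   # dict: look up a value by key

# Structures can be changed and combined with ordinary operations.
scores.append(95)
average_score = sum(scores) / len(scores)

print(type(age).__name__, type(height_m).__name__, type(name).__name__)
print(len(tags), person["name"], average_score)
```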
Domain knowledge is also an important factor, but it depends more on which industry one wants to pursue a career in. It should never be discounted; a company will often give more weight to an individual's experience and knowledge in the particular domain than to statistical and mathematical knowledge.
Data scientists who want to use a Machine Learning model must know how to interpret the evaluation metrics reported for that model.
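One thing a data scientist is commonly asked to interpret is the confusion matrix of a classifier and the scores derived from it. Choosing the confusion matrix here is an assumption, since the text does not name a specific one, and the labels below are made up. A minimal sketch in plain Python:

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    return tp, fp, fn, tn


# Made-up ground truth and model predictions for eight cases.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 1]

tp, fp, fn, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)   # share of all cases predicted correctly
precision = tp / (tp + fp)           # of the predicted positives, how many were right
recall = tp / (tp + fn)              # of the actual positives, how many were found

print(f"tp={tp} fp={fp} fn={fn} tn={tn}")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

Which score matters most depends on the domain: a model with decent accuracy can still miss too many positive cases to be useful.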
Introduction to Machine Learning
All the examples of Narrow AI in today's world are made possible by ML. Statistics and probability theory are at the core of ML. ML makes it unnecessary to apply many statistical techniques by hand: machines run the algorithms and solve the equations in an automated manner rather than having humans do it manually. All concepts in ML, like regression and clustering, are either taken directly from statistics and probability theory or derived from them.
The following six mathematical topics are crucial to Machine Learning:
- Statistics (descriptive statistics and inferential statistics) and Probability Theory
- Calculus (single and multiple variables)
- Linear Algebra
- Vector and Tensor Calculus
- Differential Equations
- Differential Geometry
There are three categories of Machine Learning:
- Supervised learning, i.e., Prediction – Facilitated mainly by regression, a statistical technique that examines existing data and either fills in gaps or makes inferences about future data. It is used to predict quantities.
- Unsupervised learning, i.e., Patterns – Finding patterns in data that were not identified in advance.
- Reinforcement learning, i.e., Actions – Finding optimal actions in a particular situation, for example, training an autonomous vehicle.
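To make the first category concrete, here is a minimal sketch of regression used for prediction: fitting a straight line to existing data by ordinary least squares and then inferring a value for an unseen input. The data points and the function names `fit_line` and `predict` are made up for this example, and real projects would normally rely on a library rather than hand-written formulas.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor,
    # which cancels in the ratio below).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Use the fitted line to estimate y for a new x."""
    return slope * x + intercept


# Made-up observations that roughly follow y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

slope, intercept = fit_line(xs, ys)
forecast = predict(6, slope, intercept)  # x = 6 was not in the data
print(f"slope={slope:.2f} intercept={intercept:.2f} forecast={forecast:.2f}")
```

The fitted slope comes out close to 2, so the line recovers the trend in the existing data and extends it to the new input.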
Artificial/Simulated Neural Networks
Advanced ML is made possible through Deep Learning, which trains artificial neural networks. These networks are a simplified mathematical representation of the way the human brain functions.
Deep Learning tries to simulate the human neuron: input arrives from a source and is processed by a unit called the perceptron, which plays the role of the neuron's nucleus. The perceptron produces an output that in turn becomes input for other perceptrons to process. At the core of every perceptron is a single mathematical function, applied to a weighted sum of its inputs, that helps solve the statistical equation.
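A single perceptron can be sketched in a few lines of Python: it takes inputs, combines them into a weighted sum, and passes the result through one mathematical function. The weights, biases, and input values below are arbitrary numbers chosen for illustration, and the sigmoid is an assumed choice of function (the classic perceptron uses a step function instead).

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through one function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


# Arbitrary example input with two features.
x = [0.6, 0.9]

# Two perceptrons process the same input...
hidden = [
    perceptron(x, weights=[0.5, -0.4], bias=0.1),
    perceptron(x, weights=[-0.3, 0.8], bias=0.0),
]

# ...and their outputs become the input of a third perceptron.
output = perceptron(hidden, weights=[1.0, -1.0], bias=0.0)
print(hidden, output)
```

Training a network means adjusting the weights and biases so the final output matches known answers; this sketch shows only the forward pass.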
With artificial neural network training and deep algorithms, ML in Narrow AI can help further a number of technological advancements, like:
- Facial recognition
- Image recognition
- Autonomous driving
- Optical character recognition
- Speech recognition
What’s the difference between computer programming and Machine Learning?
Traditional programming preceded Machine Learning.
In traditional programming, the programmer writes the rules, and the program applies them to the input it is given. Based on the rules specified, the program outputs answers. For example, if the user types the correct user ID and password on a login screen, they will be logged in. If the combination is incorrect, the login is rejected. The algorithm is based specifically on the rules written by the programmer.
In Machine Learning, the machine is told what the correct answers are, and it identifies the rules itself. For example, if we show the machine examples of what a malignant tumor looks like, it can learn on its own how to identify what makes a tumor malignant. So the answers are fed in as data and the machine outputs the rules, the reverse of traditional programming.
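The contrast can be sketched in Python. In the first half the rule is written by hand; in the second half the rule (a size cut-off) is worked out from labeled examples. Everything here is a toy assumption: the stored credentials, the tumor sizes and labels, and the idea that a single measurement separates the two classes are invented for illustration and are not medical guidance.

```python
# Traditional programming: the programmer writes the rule.
# Hypothetical credentials; real systems never store plain-text passwords.
STORED_PASSWORDS = {"alice": "s3cret"}


def login(user_id, password):
    """Rule written by hand: the pair must match a stored entry."""
    return user_id in STORED_PASSWORDS and STORED_PASSWORDS[user_id] == password


# Machine Learning: answers go in as data, a rule comes out.
# Made-up (size_mm, label) pairs, where 1 means malignant.
examples = [(4, 0), (6, 0), (9, 0), (11, 1), (14, 1), (18, 1)]


def learn_threshold(labeled):
    """Pick the cut-off that misclassifies the fewest training examples."""
    sizes = sorted(size for size, _ in labeled)
    # Candidate cut-offs: midpoints between neighbouring sizes.
    candidates = [(a + b) / 2 for a, b in zip(sizes, sizes[1:])]

    def errors(cutoff):
        return sum((size > cutoff) != bool(label) for size, label in labeled)

    return min(candidates, key=errors)


threshold = learn_threshold(examples)


def looks_malignant(size_mm):
    """The learned rule, applied to a new case."""
    return size_mm > threshold


print(login("alice", "s3cret"), login("alice", "wrong"))
print(threshold, looks_malignant(12), looks_malignant(5))
```

Nobody typed the cut-off into the second half of the program; it was derived from the labeled examples, which is the difference the two paragraphs above describe.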
To conclude, data science, Machine Learning, and AI are intricately connected to one another but should not be used interchangeably.