AI For Everyone - Learning Week 2 Building AI Projects


Workflow of a machine learning project:


Key steps for ML project – speech recognition/self-driving car
1. collect data
2. train model
2.1 iterate many times until good enough
3. deploy model
3.1 get data back, maintain/update model
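
The steps above can be sketched in a few lines of Python. This is a toy illustration only, assuming synthetic data and a hypothetical one-parameter threshold model; the names `collect_data`, `accuracy`, and `train`, and the 0.95 target, are invented for this sketch and are not from the course.

```python
import random

def collect_data(n, seed):
    """Step 1: collect data (here: synthetic, hypothetical examples).

    Each example is (x, label), where label is 1 when x > 0.6.
    """
    rng = random.Random(seed)
    return [(x, int(x > 0.6)) for x in (rng.random() for _ in range(n))]

def accuracy(threshold, data):
    """Fraction of examples where 'x > threshold' matches the label."""
    correct = sum(int(x > threshold) == label for x, label in data)
    return correct / len(data)

def train(data, candidates):
    """Step 2: pick the candidate threshold that fits the data best."""
    return max(candidates, key=lambda t: accuracy(t, data))

data = collect_data(200, seed=0)

# Step 2.1: iterate, trying finer-grained candidate models until good enough.
# For brevity this checks accuracy on the same data used for training;
# a real project would evaluate on a separate test set.
target = 0.95  # assumed acceptance level for this sketch
model = None
for steps in (2, 5, 10, 20, 50):
    candidates = [i / steps for i in range(steps + 1)]
    model = train(data, candidates)
    if accuracy(model, data) >= target:
        break

# Step 3 / 3.1: "deploy", then get new data back and re-check the model.
new_data = collect_data(200, seed=1)
deployed_accuracy = accuracy(model, new_data)
print(f"threshold={model}, accuracy on new data={deployed_accuracy:.2f}")
```

If accuracy on the new data drops, that is the signal to go back to step 2 and update the model.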

Workflow of a data science project:

Key steps for DS project – optimizing a sales funnel/manufacturing line
1. collect data
2. analyze data
2.1 iterate many times to get good insights
3. suggest hypotheses/actions
3.1 deploy changes
3.2 re-analyze new data periodically
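
As a rough sketch of the "analyze data" step for a sales funnel, the Python below computes the conversion rate between adjacent funnel steps and flags the largest drop-off. The step names and visitor counts are hypothetical, made up for illustration, and `step_conversion` is a helper invented for this sketch.

```python
# Hypothetical visitor counts at each step of a sales funnel.
funnel = [
    ("visit site", 1000),
    ("view product", 400),
    ("add to cart", 200),
    ("checkout", 50),
]

def step_conversion(funnel):
    """Return (from_step, to_step, rate) for each pair of adjacent steps."""
    rates = []
    for (name_a, count_a), (name_b, count_b) in zip(funnel, funnel[1:]):
        rates.append((name_a, name_b, count_b / count_a))
    return rates

rates = step_conversion(funnel)

# The step with the lowest conversion rate is a candidate for a hypothesis.
worst = min(rates, key=lambda r: r[2])

for name_a, name_b, rate in rates:
    print(f"{name_a} -> {name_b}: {rate:.0%}")
print("largest drop-off:", worst[0], "->", worst[1])
```

The output of such an analysis is an insight for people to act on (for example, a hypothesis about why checkout loses customers), not a deployed model.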

Every job function needs to learn how to use data:
Whether in sales, manufacturing line management, recruiting, or marketing, you can use data science to collect and analyze data, and feed that data into machine learning to get better results.

How to choose an AI project




Brainstorming framework:
- Think about automating tasks rather than automating jobs
- What are the main drivers of business value?
- What are the main pain points in your business?

You can make progress even without big data
- Having more data almost never hurts
- Data makes some businesses (like web search) defensible
- But with small datasets, you can still make progress

Due diligence on project
Technical diligence
- Can the AI system meet the desired performance
- How much data is needed
- Engineering time
Business diligence
- Lower costs
- Increase revenue
- Launch new product or business

Build vs. Buy
- ML projects can be in-house or outsourced
- DS projects are more commonly in-house
- Some things will be industry standard – avoid building those

Working with an AI team:
1. Specify acceptance criteria for the project
1.1 AI teams group data into two main datasets: the first is called the training set and the second is called the test set
1.2 the training set helps the computer figure out a mapping from A to B
1.3 the test set helps the AI team evaluate their learning algorithm's performance
1.4 for technical reasons, teams may use two different test sets; the additional one is called the development (dev) or validation set
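
A minimal sketch of how a dataset might be divided into a training set and a test set is shown below. The 80/20 ratio, the fixed seed, and the helper name `train_test_split` are assumptions for illustration; the notes do not specify how the split is done.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and split into (train, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical (A, B) pairs: input A and the label B the model should output.
examples = [(i, i % 2) for i in range(100)]

train_set, test_set = train_test_split(examples)
print(len(train_set), "training examples,", len(test_set), "test examples")
```

The model learns the A to B mapping from `train_set` only; `test_set` is held back so the acceptance criteria can be checked on examples the model has not seen.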

Pitfall: Expecting 100% accuracy
- Limitations of ML
- Insufficient data
- Mislabeled data
- Ambiguous labels

Technical tools for AI teams:
ML Frameworks:
- CNTK
- R
- Weka
Research publications:


Open source repositories


CPU vs. GPU

CPU: Computer processor (Central Processing Unit)
GPU: Graphics Processing Unit

Cloud vs. On-premises

Cloud: you rent compute servers from a provider such as Amazon's AWS, Microsoft's Azure, or Google's GCP, and use someone else's service to do your computation.

On-premises: you buy your own compute servers and run the service locally in your own company.
