A useful Python Big Data “counter” technique.

# Category Archives: Data Science

# Naïve Bayes Model

# AI & Big Data are Related

# Scaling Up System For 100-million Users

# 3 Mathematical Laws Data Scientists Need To Know: Zipf’s Law

https://www.kdnuggets.com/2021/03/3-mathematical-laws.html

Zipf’s Law (Word Frequency)
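Zipf’s Law says that a word’s frequency is roughly inversely proportional to its rank in the frequency table. A minimal sketch of how one might check this with Python’s `collections.Counter` follows. The toy sentence is an assumption made up for illustration; a real check needs a large corpus.

```python
from collections import Counter

# Toy corpus (an assumption for illustration; real checks need a large text).
text = (
    "the cat and the dog and the bird saw the fox "
    "the fox saw a cat and a dog"
)

# Count word occurrences and rank them from most to least frequent.
counts = Counter(text.split())
ranked = counts.most_common()

# Under Zipf's law, frequency is roughly proportional to 1 / rank,
# so rank * frequency should stay roughly constant on a large corpus.
for rank, (word, freq) in enumerate(ranked, start=1):
    print(f"{rank:>2}  {word:<5} freq={freq}  rank*freq={rank * freq}")
```

On a corpus this small the product `rank * freq` is noisy; the pattern only becomes visible with many thousands of words.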

# Python Libraries for Machine Learning

# Build First Data Science Apps

# Naive Bayes | Gaussian Naive Bayes with Hyperparameter Tuning in Python

# Naive Bayes Classification (ML): Naive Bayes Probability

# Homology

**Part 1**: “Homology” without prerequisites, except “Function” (the presenter loosely interchanges it with “Mapping”, although a Function is stricter: each element has one and only one image).

**Part 2**: Simplex (单纯形)

Topology History: Euler Characteristic, e.g. V – E + R = 2

Poincaré invention

This video uses the algebra of points, lines, triangles… to explain a Simplex (plural: Simplices) in R^{n} space. The idea is to organize the n-dimensional “Big Data” data points into Simplices, then (in future Parts 3, 4…) compute the “holes” (a pattern studied by Persistent Homology).

**Part 3**: Boundary

Part 3 justifies why triangles (formed by any 3 data points), called “Simplices” 单纯形 (singular: Simplex), are the best shapes to fill any Big Data space.
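The boundary idea can be sketched in a few lines of Python. This is not taken from the video: the chain representation (a dict from vertex tuples to integer coefficients) and the function name `boundary` are assumptions chosen for illustration. The sketch takes the boundary of a triangle and then shows that the boundary of that boundary vanishes.

```python
from collections import defaultdict

def boundary(chain):
    """Boundary of a chain given as {simplex (tuple of vertices): coefficient}."""
    result = defaultdict(int)
    for simplex, coeff in chain.items():
        if len(simplex) == 1:
            continue  # convention used here: the boundary of a point is zero
        for i in range(len(simplex)):
            # Drop the i-th vertex; the sign alternates with i.
            face = simplex[:i] + simplex[i + 1:]
            result[face] += (-1) ** i * coeff
    # Remove faces whose coefficients cancelled out.
    return {s: c for s, c in result.items() if c != 0}

triangle = {("a", "b", "c"): 1}
edges = boundary(triangle)
print(edges)            # the three oriented edges of the triangle
print(boundary(edges))  # {} : the boundary of a boundary is zero
```

The cancellation in the last line is what makes “holes” well defined: a cycle with no boundary that is not itself the boundary of something is a hole.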

# Python for naive Bayes algorithm

# Information Theory for Data Science

# Inferential Statistics

# Differential Equations Versus Machine Learning

Differential Equations Versus Machine Learning | by Col Jung | Nov, 2020 | Medium

# TDA : Topological Data Analysis

Data Science now applies Algebraic Topology : Persistent Homology.

# Why Use Ensemble Learning?

# Data Science Minimum: 10 Essential Skills You Need to Know to Start Doing Data Science

# Robin Li (李彦宏), Baidu CEO: 3 Episodes of the Internet: PC -> Mobile -> AI

Robin Li (李彦宏), Baidu CEO: speech at Cambridge University

“3 Waves of the Internet”:

1) PC-based (1997 –)

- Search Webpages
- 6-month software update cycle

2) Mobile-based (2010 –)

- “APP” is born
- Ecosystem: e.g. Apple App Store, Google Play Store
- O2O (Online to Offline): same-day hotel booking, restaurants, …
- Software updates several times a day

3) AI-based (2017 – now)

- Voice recognition sans keyboard input
- Image recognition (e.g. customer ePayment: McDonald’s)
- Natural-language pattern recognition, NLP (e.g. a salesman’s virtual assistant)

# How much Math you need in Data Science?

# Time Series Forecasting with Python

“Hands-on Time Series Forecasting with Python” by Idil Ismiguzel https://link.medium.com/KulDiXl816

# Functional Programming in Data Science Projects

“Functional Programming in Data Science Projects” by Nathanael Weill https://link.medium.com/UiysKbFl16

# MATH en supérieure -les applications- (Higher Math: Mappings)

French mathematics splits the notion of Correspondence (对应) into two definitions, Mapping (Application) and Function, both allowing a **maximum of 1 arrow** from each element of E into F:

- Mapping (L’Application 映射)

- Function (La Fonction 函数): **ONLY ONE** image in F

Counter Examples:
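As a sketch of such counter examples, the snippet below models a correspondence as a set of arrows (pairs) from E to F. It assumes the usual French textbook convention, which the notes above do not spell out in full: a “fonction” gives each element of E at most one image, while an “application” gives each element exactly one. The helper names and the sample data are hypothetical.

```python
def images(arrows, x):
    """All elements of F that x is sent to by the correspondence."""
    return {y for (a, y) in arrows if a == x}

def is_function(E, arrows):
    # Assumed convention for "fonction": at most one image per element of E.
    return all(len(images(arrows, x)) <= 1 for x in E)

def is_mapping(E, arrows):
    # Assumed convention for "application": exactly one image per element of E.
    return all(len(images(arrows, x)) == 1 for x in E)

E = {1, 2, 3}
square = {(1, 1), (2, 4), (3, 9)}   # every element has exactly one image
partial = {(1, 1.0), (2, 0.5)}      # 3 has no image
two_roots = {(1, 1), (1, -1)}       # 1 has two images

print(is_mapping(E, square))                           # True
print(is_function(E, partial), is_mapping(E, partial)) # True False
print(is_function(E, two_roots))                       # False
```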

# Free Online Math in Data Science

# How can maths fight a pandemic?

Key points:

- SIR Model: used since 1910 for flu pandemic
- SEIR Model: “E” for “Exposed” people who are asymptomatic, as with COVID-19.
- Social Distancing: 60% reduction effect on *R0* (Reproduction number: 1.4 ~ 2.5)
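To get a feel for the key points above, here is a minimal SIR sketch using simple Euler steps. It is an illustration under stated assumptions, not the model from the linked article: the 5-day infectious period, the initial infected fraction and the step size are made-up values, and the “60% reduction” is read as multiplying R0 by 0.4.

```python
def simulate_sir(r0, days=300, infectious_days=5.0, i0=1e-4, dt=0.1):
    """Integrate the SIR equations with Euler steps.

    S, I, R are fractions of the population.
    Returns (peak infected fraction, final recovered fraction).
    """
    gamma = 1.0 / infectious_days   # recovery rate (assumed value)
    beta = r0 * gamma               # transmission rate, since R0 = beta / gamma
    s, i, r = 1.0 - i0, i0, 0.0
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return peak, r

peak_free, total_free = simulate_sir(r0=2.5)
peak_dist, total_dist = simulate_sir(r0=2.5 * 0.4)  # 60% reduction of R0

print(f"no distancing:   peak {peak_free:.1%}, total infected {total_free:.1%}")
print(f"with distancing: peak {peak_dist:.1%}, total infected {total_dist:.1%}")
```

With R0 pushed down to about 1, the infected fraction never grows beyond its starting value, which is the whole point of distancing.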

https://plus.maths.org/content/how-can-maths-fight-pandemic

…

# The Math Behind Social Distancing

# Speak Math, not Code

Coding is about algorithms, and algorithms are Math. AI, Machine Learning, Deep Learning, Big Data, 5G Polar Codes, etc. are all about Math.

# MIT New Course (Prof Gilbert Strang) : Linear Algebra and Learning From Data

[**MIT OCW Online Course Videos**]

[Full Video]

# Shing-Tung Yau (丘成桐): Fundamental Mathematics and AI, Big Data

AI and Big Data are Twins, their Mother is Math.

“AI 3.0” today, although impressive in “Deep Learning”, still uses “primitive” high-school Math, namely:

- Statistics,
- Probability (Bayesian) ,
- Calculus (Gradient Descent)
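The calculus item is the easiest of the three to show in code. Below is a minimal gradient descent sketch on a one-variable function; the function, learning rate and step count are assumptions picked for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# Example: f(x) = (x - 3)^2 has derivative f'(x) = 2 (x - 3)
# and its minimum at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(minimum)  # very close to 3
```

Deep Learning applies the same update rule, only to millions of parameters at once and with the gradient computed by backpropagation.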

AI has not taken advantage of the power of post-modern Math invented since WW II, especially the IT-related branches, i.e.:

- Category Theory (Functional Programming),
- Algebraic Topology : Homology (Big Data Analytics)
- Homotopy Type Theory ‘HoTT’ (Machine Proof Math Theorems) .

That is the argument of the Harvard Math Dean Prof ST Yau 丘成桐 (first Chinese Fields Medalist), who predicts that the future “AI 4.0” can be smarter and more powerful.

… Current AI deals with Big Data:

1. Its approach is purely statistical and experience-oriented, not derived from Big Data’s inherent mathematical structures (e.g. Homology or Homotopy).
2. The data-analysis result is environment-specific and lacks portability to other environments.

…

3. Lack of effective algorithms, esp. for Algebraic Topology, which computes Homology or Co-homology using Linear Algebra (Matrices).

4. Limited by hardware speed (e.g. GPU), it is reduced to a layered-structure problem-solving approach. This is simple mathematical analysis, not the REAL **Boltzmann Machine**, which finds the optimal solution.
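Point 3 says Homology is computed with Linear Algebra. A minimal sketch of what that means, using NumPy’s `matrix_rank` on the boundary matrices of a triangle, is given below. The basis ordering (vertices a, b, c; edges ab, bc, ac) is an assumption of this example, and the ranks are taken over the real numbers, which is enough to count holes here.

```python
import numpy as np

# Boundary matrix d1: columns are edges ab, bc, ac; rows are vertices a, b, c.
# Each edge xy has boundary y - x.
d1 = np.array([[-1,  0, -1],
               [ 1, -1,  0],
               [ 0,  1,  1]])

# Boundary matrix d2 of the filled triangle abc: bc - ac + ab,
# written in the edge basis (ab, bc, ac).
d2 = np.array([[1], [1], [-1]])

def betti_numbers(d1, d2=None):
    """Return (b0, b1): number of connected components and of 1-dimensional holes."""
    n_vertices, n_edges = d1.shape
    rank_d1 = np.linalg.matrix_rank(d1)
    rank_d2 = 0 if d2 is None else np.linalg.matrix_rank(d2)
    b0 = n_vertices - rank_d1
    b1 = (n_edges - rank_d1) - rank_d2  # dim(kernel of d1) - dim(image of d2)
    return int(b0), int(b1)

print(betti_numbers(d1))      # hollow triangle: (1, 1), one piece with one hole
print(betti_numbers(d1, d2))  # filled triangle: (1, 0), the hole is filled in
```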

**Notes**:

**AI 1.0** : 1950s by Alan Turing, MIT John McCarthy (coined the term “AI”, Lisp Language inventor).

**AI 2.0** : 1970s/80s. “Rule-Based Expert Systems” using Fuzzy Logic.

[**AI Winter** : 1990s / 2000s. Failed ambitious Japanese “5th Generation Computer” based on Prolog-based “Predicate” Logic]

**AI 3.0** : 2010s – now. “Deep Learning” by Prof Geoffrey Hinton using primitive Math (Statistics, Probability, Calculus Gradient Descent)

**AI 4.0 :** Future. Using “Propositional Type” Logic, Topology (Homology, Homotopy) , Linear Algebra, Category.

# Mathematics behind Machine Learning – The Concepts you Need to Know

https://www.analyticsvidhya.com/blog/2019/10/mathematics-behind-machine-learning/

Data Science and Machine Learning (a sub-discipline of AI) overlap but are not the same:

# Data Science Math

# Statistical Significance Explained

“Statistical Significance Explained” by Will Koehrsen

Normal Distribution

Convert to Z-score

p-Value vs Alpha (0.05 = 5% threshold for noise)
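The three steps above (normal distribution, z-score, p-value against alpha) fit in a short function. This is a sketch with made-up numbers, not the worked example from the article; it uses `statistics.NormalDist` from the standard library and a two-sided test.

```python
from statistics import NormalDist

def z_test(sample_mean, population_mean, population_sd, n, alpha=0.05):
    """Two-sided z-test: returns (z-score, p-value, significant?)."""
    standard_error = population_sd / n ** 0.5
    z = (sample_mean - population_mean) / standard_error
    # Probability of a result at least this extreme under the standard normal.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha

# Hypothetical numbers: a sample of 36 averages 7.5,
# against a population mean of 7.0 with standard deviation 1.5.
z, p_value, significant = z_test(7.5, 7.0, 1.5, n=36)
print(f"z = {z:.2f}, p = {p_value:.4f}, significant at 5%: {significant}")
```

Here the z-score is 2.0 and the p-value is a little under 0.05, so the result just clears the 5% threshold.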

# A Programmer’s Regret: Neglecting Math at University – Adenoid Adventures

Advanced Programming needs Advanced Math: eg.

Video Game **Animation**: Verlet Integration

**AI**: Stats, Probability, Calculus, Linear Algebra

**Search Engine** : PageRank: Linear Algebra

**Abstraction** in Program “Polymorphism” : Monoid, Category, Functor, Monad

Program “**Proof**” : Propositions as Types, HoTT

https://awalterschulze.github.io/blog/post/neglecting-math-at-university/


# Why do Neural Networks Need an Activation Function? | Data Stuff

http://www.datastuff.tech/machine-learning/why-do-neural-networks-need-an-activation-function/

**Activation function**: Non-Linear Function

..

**What if there is no Activation function**: the network reduces to an Affine Transformation
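A small NumPy sketch makes this concrete. The weights below are hand-picked assumptions, chosen so the arithmetic is easy to follow: two stacked layers without an activation equal one affine map, while inserting a ReLU between them changes the result.

```python
import numpy as np

# Hand-picked weights for a tiny 2 -> 2 -> 1 network (illustrative values).
W1, b1 = np.array([[1.0, -1.0], [-1.0, 1.0]]), np.array([0.0, 0.0])
W2, b2 = np.array([[1.0, 1.0]]), np.array([0.0])
x = np.array([1.0, 0.0])

# Two layers with no activation in between...
two_layers = W2 @ (W1 @ x + b1) + b2

# ...are the same as a single affine map W x + b.
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b
print(np.allclose(two_layers, one_layer))  # True

# With a non-linear activation (ReLU) the collapse no longer happens.
def relu(v):
    return np.maximum(v, 0.0)

with_relu = W2 @ relu(W1 @ x + b1) + b2
print(two_layers, with_relu)  # [0.] [1.]
```

However many layers are stacked, without the non-linearity the network can only represent what a single layer could.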

# AI with Advanced Math helps in discovering new drugs

**Advanced Mathematical Methods** with **AI** is a powerful tool:

- **Algebraic Topology** (Persistent **Homology**)
- Differential Geometry
- Graph Theory

https://sinews.siam.org/Details-Page/mathematical-molecular-bioscience-and-biophysics-1

# Learn NUMPY in 5 minutes – BEST Python Library!