Reference List

Useful tools

Plotly: interactive plots

StackEdit.io -- good way to document. Produces HTML. Put notes in here, not in Word.

Notion: team documents and organisation.

Export the Jupyter Notebook to HTML and send the HTML document. It can include the code, which can be hidden with a code snippet, and it can include interactive plots.
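
A possible way to script the export, as a rough sketch rather than the snippet these notes refer to: instead of a hide-code snippet inside the notebook, it relies on nbconvert's `--no-input` flag, which leaves the code cells out of the HTML and keeps their outputs. The function names and the `analysis.ipynb` filename are made up for illustration, and it assumes `jupyter` with nbconvert is installed and on the PATH.

```python
import shutil
import subprocess


def build_export_command(notebook, hide_code=True):
    """Return the nbconvert command line that exports a notebook to HTML.

    `notebook` is a path to an .ipynb file (hypothetical name in the demo below).
    """
    cmd = ["jupyter", "nbconvert", "--to", "html"]
    if hide_code:
        # --no-input excludes the code cells; their outputs stay in the HTML
        cmd.append("--no-input")
    cmd.append(str(notebook))
    return cmd


def export_notebook(notebook, hide_code=True):
    """Run the export. Raises if jupyter is missing or nbconvert fails."""
    if shutil.which("jupyter") is None:
        raise RuntimeError("jupyter is not on PATH")
    subprocess.run(build_export_command(notebook, hide_code), check=True)


if __name__ == "__main__":
    # Only prints the command; call export_notebook(...) to actually run it.
    print(" ".join(build_export_command("analysis.ipynb")))
```

Passing `hide_code=False` keeps the code cells in the exported HTML, which suits recipients who want to see how a result was produced.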

Reddit data science

Python graph gallery: lots of examples with code. The Python Graph Gallery -- Visualizing data -- with Python

YouTube: "StatQuest" is a good level for statistics and model learning, e.g. SVM explained.

Regex builder: interactive regex. RegExr: Learn, Build, & Test RegEx

Bookmarks

Data Science

[Kaggle: Your Home for Data Science]{.underline}

Cheat Sheets

[Probability Cheatsheet]{.underline}

[pandas/Pandas_Cheat_Sheet.pdf at master · pandas-dev/pandas · GitHub]{.underline}

[GitHub - juliangaal/python-cheat-sheet: Python for Data Science - NumPy, Matplotlib, Pandas, SciKit Learn ...]{.underline}

[Essential Cheat Sheets for Machine Learning and Deep Learning Engineers]{.underline}

[Data Science Free | Cheat Sheets]{.underline}

[Scikit_Learn_Cheat_Sheet_Python.pdf]{.underline}

Data Sets

[Dataset Search]{.underline}

[https://archive.ics.uci.edu/ml/index.php]{.underline}

Favorite Modules

[Missingno]{.underline} -- this is a good way to analyse nulls.

[Pywaffle]{.underline}

Favorite Notebooks

Learning

[Reducing DataFrame memory size by ~65% | Kaggle]{.underline}

[Image Pre-processing for Wild Images]{.underline}

[Gini Coefficient - An Intuitive Explanation | Kaggle]{.underline}

[Interactive Porto Insights - A Plot.ly Tutorial | Kaggle]{.underline}

[A Data Science Framework: To Achieve 99% Accuracy | Kaggle]{.underline}

Good Examples

[https://www.kaggle.com/subinium/the-hitchhiker-s-guide-to-the-kaggle]{.underline}

[Introduction to Ensembling/Stacking in Python]{.underline}

General Resources

[80+ Free Data Science Books - Data Science Central]{.underline}

[How to Become a Data Scientist: The Definitive Guide]{.underline}

[[How to Level Up as a Data Scientist (Part 1) -- Towards Data Science -- Medium]{.underline}](https://medium.com/towards-data-science/how-to-level-up-as-a-data-scientist-part-1-9ea6a775f239)

[Python Data Science Handbook | Python Data Science Handbook]{.underline}

[Data Science at the Command Line]{.underline}

[Scraping Playlist]{.underline}

[Seeing Theory]{.underline}

[List of Data Science Resources : datascience]{.underline}

[How to get your first job in Data Science? -- The Mission -- Medium]{.underline}

[GitHub - jakevdp/PythonDataScienceHandbook: Jupyter Notebooks for the Python Data Science Handbook]{.underline}

[GitHub - ben519/MLPB: Machine Learning Problem Bible | Problem Set Here >>]{.underline}

[Explained Visually]{.underline}

[Complete List of Data Science Resources - Google Sheets]{.underline}

[Jupyter Notebook Shortcuts]{.underline}

Help Resources

[API Reference --- scikit-learn 0.21.3 documentation]{.underline}

[Choosing the right estimator]{.underline}

[Data Visualization -- How to Pick the Right Chart Type?]{.underline}

[seaborn: statistical data visualization --- seaborn 0.9.0 documentation]{.underline}

[The Python Graph Gallery -- Visualizing data -- with Python]{.underline}

[User guide: contents --- scikit-learn 0.21.3 documentation]{.underline}

[Which Method to Use?]{.underline}

[Markdown Cheatsheet]{.underline}

[What would you like to show?]{.underline}

[Plotly Python Graphing Library]{.underline}

Stats Learning

Machine Learning Methods

[Support Vector Machines]{.underline}

Regression Methods

[Logistic Regression]{.underline}

[Stochastic Gradient Descent]{.underline}

[Gradient Descent]{.underline}

[Regularization Part 1: Ridge Regression]{.underline}

Clustering

[K-means Clustering]{.underline}

Classification

[K-nearest neighbors]{.underline}

[Naïve Bayes Classifier]{.underline}

[SVC RBF Overfitting Solution]{.underline}

[In Depth: Parameter tuning for SVC]{.underline}

Stats and Machine Learning Terms

[ROC and AUC]{.underline}

[Precision and recall]{.underline}

[Machine Learning: Bias VS. Variance - Becoming Human: Artificial Intelligence Magazine]{.underline}

[Machine Learning Fundamentals: Bias and Variance]{.underline}

Questions

[How is the k-nearest neighbor algorithm different from k-means clustering? - Quora]{.underline}

[When to avoid Random Forest?]{.underline}

[What is the difference between Support Vector Machine and Support Vector Regression?]{.underline}

Tools

[dbdiagram.io - Database Relationship Diagrams Design Tool]{.underline}

[Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript]{.underline}

[RegExr: Learn, Build, & Test RegEx]{.underline}

[StackEdit -- In-browser Markdown editor]{.underline}

[Sci-Hub: removing barriers in the way of science]{.underline}

Visualization Inspiration

[Top 50 matplotlib Visualizations - The Master Plots (w/ Full Python Code) | ML+]{.underline}

[How to Generate FiveThirtyEight Graphs in Python]{.underline}

Look Into

[Welcome to Metaflow - Metaflow]{.underline}

[Metaflow: Netflix has open-sourced their Python library for data science project management : datascience]{.underline}

[L1 and L2 Regularization - Data Driven Investor - Medium]{.underline}

[A One-Stop Shop for Principal Component Analysis - Towards Data Science]{.underline}