## Posts

### New home for Datageeko.com; Bye WordPress!

others

Fun weekend project for migrating WordPress to Jekyll.

### The Search for Universal Correlation

data-science

There are various ways to measure association between variables. Is there a universal one to rule them all?

### How to Perform EDA Efficiently

data-science

You have been tasked with crunching a dataset and extracting insights in less than 24 hours. Before we put our heads down and grind harder, is there a more efficient strategy to go about it?

### Statistical Bias and Paradoxes that creep up in your data analysis

data-science random-questions statistics

Statistical bias can creep into our analysis and cause us to communicate the wrong insights and drive home the wrong conclusions.

### Practical A/B Testing

data-science statistics

A Jupyter notebook is embedded within this post. Visit the notebook here if it does not render properly in your browser.

### Test your knowledge - Tricky Probability questions with answers

statistics test-your-knowledge

This post is displayed directly from my notebook on GitHub. You might want to view it on GitHub directly if it doesn’t render properly in your browser.

### Let's say we have 1 million app rider journey trips. We want to build a model to predict ETA after a rider makes a ride request...

data-science machine-learning test-your-knowledge

...how would we know if we have enough data to create an accurate enough model?

### Let's say you have a categorical variable with thousands of distinct values, how would you encode it?

machine-learning test-your-knowledge

One-hot encoding is out of the question, since a large number of distinct values will result in high-dimensionality problems (the curse of dimensionality) at the modeling stage.

### Let's say we want to build a model to predict booking prices for a hotel booking company. Between linear regression and random forest regression, which model would perform better and why?

machine-learning statistics test-your-knowledge

Before we quickly answer “Random Forest”, let’s take a step back and put on our structured thinking cap to ask ourselves why, and why companies might make the other choice in real life.

### Illustrated guide to Hypothesis testing using Python

data-science python statistics

This is a hands-on guide to hypothesis testing, where we use both “hand-coded” calculations and common statistical libraries to compute different statistical tests.