Data on the Edge: Handling Outliers

Before we tackle how to handle them, let’s quickly define what an outlier is. An outlier is any data point that is distinctly different from the rest of your data. When you’re looking at a variable that is relatively normally distributed, you can think of...
Facebook, Data Science and Valentine’s Day

Today is February 14th, also known as Valentine’s Day or the Feast of Saint Valentine, depending on where you are in the world. In some places it’s a holiday. The day first took on romantic connotations thanks to Chaucer’s poetry in the...
Detection and Correction: Data Prep Pitfalls

Jon Macmillan, Senior Data Analyst

Seven challenges to be aware of when exploring your data

According to an article written by Northeastern University, the total amount of data in the world was 4.4 zettabytes in 2013 (that’s 4,400,000,000,000 gigabytes!). We are...