## Hypothesis Testing: how to determine significance

The main question we are interested in answering is: Does discount amount have a statistically significant effect on the amount of…

## Neural networks for dummies: a quick intro to this fascinating field

You will also learn some buzzwords to impress the family at the dinner table, especially if you follow the reading…

## Entropy is a measure of uncertainty

Given certain assumptions (and foreshadowing an important result mentioned below), entropy is the measure of uncertainty. By the way, when I…
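As a rough illustration of entropy as a measure of uncertainty, the sketch below assumes the excerpt refers to the Shannon entropy of a discrete distribution, measured in bits. The function name `shannon_entropy` and the example distributions are illustrative and do not come from the article.

```python
import math


def shannon_entropy(probs):
    """Return the Shannon entropy, in bits, of a discrete distribution.

    Assumption: `probs` is a sequence of probabilities that sums to 1.
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p == 0 are skipped: they contribute nothing to the sum.
    return sum(-p * math.log2(p) for p in probs if p > 0)


# Hypothetical example distributions over two outcomes.
fair_coin = shannon_entropy([0.5, 0.5])    # most uncertain two-outcome case
biased_coin = shannon_entropy([0.9, 0.1])  # more predictable, lower entropy
certain = shannon_entropy([1.0, 0.0])      # no uncertainty at all
```

The more predictable the outcome, the lower the value: the fair coin scores highest, the biased coin lower, and the certain outcome scores zero.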

## What Does an Ideal Data Scientist’s Profile Look Like? — Findings from Analyzing 1000 Indeed Job Postings

Because of the broad nature of the Data Scientist profession, other languages also play important roles. In summary, the top languages…

## With These New Additions, AWS SageMaker is Starting to Look More Real for Data Scientists

This week during the re:Invent conference, AWS announced a series of new releases that bring its SageMaker platform closer to…

## Solving NLP task using Sequence2Sequence model: from Zero to Hero

In short, NER is the task of extracting Named Entities from a sequence of words (a sentence). Here I’m going to…

## Get started with Machine Learning by building a simple project

No matter what the input is, the function outputs a value between 0 and 1. We have implemented this cost function…
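The excerpt does not name the function, so the sketch below assumes it is the logistic sigmoid, a common choice for squashing any real input into the range 0 to 1. The name `sigmoid` and the sample inputs are illustrative only.

```python
import math


def sigmoid(z):
    """Logistic sigmoid (assumed here to be the function the excerpt means).

    The two branches keep math.exp from overflowing for large |z|.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical inputs spanning a wide range.
samples = [-1000.0, -2.0, 0.0, 2.0, 1000.0]
outputs = [sigmoid(z) for z in samples]
```

Every entry of `outputs` stays within the bounds 0 and 1, and the values increase with the input, with `sigmoid(0.0)` sitting at the midpoint.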

## A Simple Guide to creating Predictive Models in Python, Part-2b

Therefore the below method is easier and scalable. First, just take a look at all the columns: `list(deep_feat.columns)`. Output: `['CreditScore', 'Geography', 'Gender', 'Age',…`

## A Beginner’s Guide to Plotting ‘FiveThirtyEight Like’ Visualizations

To do this, we define a function that loops through the “Group” column and creates a new “Occupations” column. See added…

## A Guide to Decision Trees for Machine Learning and Data Science

By George Seif, Nov 30. Decision Trees are a class of very powerful Machine…

## Using NLP to Identify Redditors Who Control Multiple Accounts

I built a model that can determine if two…

## What’s the big deal about Decentralized Consensus?

Systems built upon decentralized consensus methods are inherently tamper-proof, censorship-resistant, and permissionless. The cryptographic and game-theoretic principles underlying decentralized consensus…

## Data Science “Paint by the Numbers” with the Hypothesis Development Canvas

Figure 2: Hypothesis Development Canvas

The Hypothesis Development Canvas includes the following: A Vision Workshop accelerates the collaboration between the business stakeholder…

In my opinion, a productive analyst needs to understand the business, the data and the tools. Your new hire may have…

## Key Takeaways from AI Conference SF, Day 2: AI and Security, Adversarial Examples, Innovation

DSAs provide a great opportunity for innovation, as hardware and software are designed from scratch, focused on a very…

## Why do I Call Myself a Data Scientist?

It’s not particularly easy to define Data Science as a whole, or as a subject. I’ve done it in other articles: Creating Intelligence…

## BIG, small or Right Data: Which is the proper focus?

These humongous data sets are collected via many different means including computer networks, social media profiles, web browsing histories, mobile…

## How to use Machine Learning and Quilt to Identify Buildings in Satellite Images

Jared Yamaoka was an Insight Data Science Fellow…

## Things you should know when traveling via the Big Data Engineering hype-train

These were also the things I asked candidates for the Big Data Engineer position at my previous company. I will try to…

## Stop Installing Tensorflow Using pip for Performance Sake!

By Michael Nguyen, Software and Machine Learning Engineer

Stop installing TensorFlow using pip! If you aren’t already using conda, I recommend that…

## The Big Data Game Board™

Figure 2: The Big Data Game Board™

Spin the dial and see if you can avoid the following Big Data (and…

## How Data Science Is Improving Higher Education

Increasingly, colleges and universities, as well as governments, are using data science to improve the ways educational institutions do everything…

## Using Uncertainty to Interpret your Model

We’ll dive into this in a moment, but first, let’s talk about different types of uncertainty. There are different types of…

## The Most in Demand Skills for Data Scientists

By Jeff Hale, Data Scientist Focused on Machine Learning – CoFounder and COO at E-commerce Firms

I scoured job listing websites…

## 5 Reasons Why You Should Use Cross-Validation in Your Data Science Projects

Before I present my five reasons to use cross-validation, I want to briefly go over what cross-validation is and…
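For readers who want a concrete picture of what cross-validation is, here is a minimal sketch of plain k-fold splitting, assuming contiguous and unshuffled folds. The helper `k_fold_indices` is hypothetical and not taken from the article; in practice a library splitter would normally be used instead.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds.

    Assumption: no shuffling or stratification, just consecutive slices.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


# Hypothetical dataset of 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
```

Each sample lands in the test set exactly once, so a model is trained k times and every observation is used for evaluation once.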