number

Predicted distribution of Mersenne primes

We’ll construct a plot below using Python. Note that the conjecture is asymptotic, and so it could make poor predictions…

More bc weirdness

Actually no. It assumes that any single letter that could be a hex number is one. But in numbers with…

Estimating vocabulary size with Heaps’ law

Heaps’ law says that the number of unique words in a text of n words is approximated by V(n) = K n^β, where…
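As a rough illustration of the formula in the excerpt above, the following Python sketch evaluates V(n) = K n^β. The function name and the values of K and β are placeholders chosen for illustration; they are not taken from the post, and in practice both parameters would be fitted to a corpus.

```python
def heaps_vocabulary(n, K, beta):
    """Estimate the number of unique words in an n-word text: V(n) = K * n**beta."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return K * n ** beta


# Illustrative placeholder parameters (assumptions, not values from the post).
K, beta = 40.0, 0.5

for n in (10_000, 1_000_000):
    print(n, round(heaps_vocabulary(n, K, beta)))
```

Because the law is a power law, doubling the length of the text multiplies the predicted vocabulary by 2^β rather than by 2.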

Proving that a choice was made in good faith

This is something I’ve helped companies with. It may be impossible to prove that a choice was not deliberate, but…

Detecting a short period in an RNG

The last couple posts have been looking at the Cliff random number generator. I introduce the generator here and look…

Fixed points of the Cliff random number generator

I ran across the Cliff random number generator yesterday. Given a starting value x0 in the open interval (0, 1),…

Data breach trends

This post gives a crude, back-of-the-envelope calculation to address the question. We won’t look at number of breaches per se…

Number of feet in a mile

Here are a couple amusing things I’ve run across recently regarding the number of feet in a mile. Both are…

Bootstrapping at scale in Snowflake

The answer, of course, is that we need a “good enough” alternative. We’re sampling after all, so the level of…

How Machine Learning Can Lower the Search Cost for Finding Better Hikes

Perry Johnson, Jul 9. I recently went on a weekend camping trip…

Scraping and Exploring the Entire English Audible Catalog

Toby Manders, Jul 2. Last week I wrote a script using the HTML-Requests package for Python…

The Political Twittersphere of the UK

An analysis of how the constituent parties and members of the UK government differ in their…

Visualisation of Information from Raw Twitter Data — Part 2

Let’s check it out! For this we need to download and import the Botometer Python library, and get a key to…

A beginner’s guide to Kaggle’s Titanic problem

Sumit Mukhija, Jun 22. Image source: Flickr. Since this is my first post, here’s a brief introduction of what…

Beginning Python Programming — Part 14

An introduction to multi-threading. Bob Roebling, Jun 20. Photo by Franck V. on Unsplash. In part 13 of Beginning Python Programming, we covered…

Classification of Moscow Metro stations using Foursquare data

Stanislav Rogozhin, Jun 12. This post is the capstone project of the Coursera IBM Data…

This is the second article in a series of publications about acquiring data from Twitter and using it to gain…

Optimizing the dynamic programming solution for the Knapsack Problem

Fabian Terh, Jun 13. Photo by Aperture Vintage on Unsplash. Previously, I wrote about solving a couple…

Analyzing Netease Music- Part I: Playlist

Martin Liu, Jun 11. Netease Music (https://music.163.com/), a Chinese equivalent of Spotify, is a music…

So your friend suggests that you and they take turns digging… Let's say it takes you 100 minutes to finish this…

Best clustering algorithms for anomaly detection

How to use it?” Now we have the clusters… How can we detect anomalies in the test data? The approach I’ve followed to classify…

Eligibility Traces in Reinforcement Learning

Ziad SALLOUM, Jun 4. Sometimes looking backward isn’t that bad. Photo by Didier Provost on Unsplash. What are Eligibility Traces? In short…

Counting to infinity at compile time

If we can do 2 we can do 4, and 8, and 16. By this point we’re encoding the operation…

Python Pro Tip: Use Itertools, Generators, and Generator Expressions

Think of the memory such a list would occupy. It would be great if we had something that could just…
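The excerpt above is cut off, but the memory concern it raises can be sketched by comparing a fully built list with a generator expression that yields the same values lazily. This is a minimal illustration, not the post's own example; the variable names and the size n are assumptions.

```python
import itertools
import sys

n = 100_000  # illustrative size (an assumption, not from the post)

# A list comprehension materializes every element up front.
squares_list = [i * i for i in range(n)]

# A generator expression produces elements one at a time, on demand.
squares_gen = (i * i for i in range(n))

# sys.getsizeof reports only the container itself (for the list, its array of
# references), yet the list is still far larger than the generator object.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size, gen_size)

# The generator can still be consumed by anything that iterates.
total = sum(squares_gen)

# itertools.islice takes a finite prefix of an unbounded generator.
first_five = list(itertools.islice((i * i for i in itertools.count()), 5))
print(first_five)
```

A generator can be consumed only once, so a second pass over the same values requires creating a new generator.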

Clustering Evaluation strategies

Manimaran, May 22. Clustering is an unsupervised machine learning algorithm. It helps in grouping data points into clusters. Validating the…