My Open-Source Machine Learning Journey Begins
My new blog, A Human Learning Machine Learning, Launched on January 1st
Alex Ortiz
Jan 18

In technology and much else, there’s an important connection between the research and developments of the past and the actual happenings of the present.
Exploring that connection is essential to the process of learning.
History is a great teacher, especially because, since things don’t happen in a vacuum, it’s usually wise to consider the present in relation to the past.
There’s a second, equally important connection, something of an instructive question mark between what’s actual today and what’s possible tomorrow, that can help to accelerate the learning process.
To understand and guide what you do today, you should also ask about what you might do tomorrow.
The advantage of following that learning arc is that you’re able to connect knowledge from the past to context for the present and to questions about the potential of the future.
It’s an exercise we all do in our personal lives, the equivalent of, “Where have I been, how did I get here, and what am I capable of accomplishing?” In technology, this process is especially important if you’re trying to solve problems in a way that preserves as few blind spots as possible: if you wish to apply technology to solutions that endure, you need knowledge, context, curiosity — and perspective.
This brings me to why I decided to launch a machine learning blog.
Now that distributed systems and decentralized blockchains are being explored in earnest out in the market, one of the technology areas I want to better understand is that of machine learning.
The reason is that we should begin to ask how data stored immutably in blockchains might be accessed and analyzed by AI software in the future to produce new insights and solutions to business, social, and other problems that we have today or might arise tomorrow.
As part of my effort to learn about that intersection between data and analysis, I recently embarked on a learning journey about a space I knew nothing about: machine learning.
Having spent a year learning about and working in the blockchain space, and having spent many months regularly learning about cryptography on my own, I’d developed an appreciation for the power of open-sourcing your learning journeys: sharing what you’re learning, thinking about, and asking.
By doing so, not only do you learn more quickly, but also you’re able to share the spoils of that accelerated learning process with other people.
Below, I’ve cross-posted the first three of my daily blog entries from A Human Learning Machine Learning.
Thanks for joining me on this learning journey.
Jan 1, 2019: My Open-Source Machine Learning Journey Begins
Jan 2, 2019: Resources for the Newcomer to Machine Learning
Jan 3, 2019: Should Machines Learn to Process Speech before They Learn to Process Text?

January 1, 2019: My Open-Source Machine Learning Journey Begins
(Cross-posted from My Open-Source Machine Learning Journey Begins)

Earlier today on LinkedIn, I wrote the following:
2018 was a year of learning… 2019 will be a year of dual-wielding, of learning and applying: learning more about distributed systems and decentralized blockchains, learning about machine learning, experimenting with applications, applying my heart to clearer goals, learning to apply ever-stronger curiosity to my mission, and allowing myself to fully explore my hunches.
A little over one year ago, I began to learn about blockchains, distributed ledgers, and cryptography.
Once you learn the basics of what makes these work together, it’s easy to become excited about the future of information.
In a world that will increasingly rely on ways to store, access, distribute, and analyze data from a combination of centralized, distributed, and decentralized stores, the essential follow-up question becomes, “How will humans use data to build machines that help advance the aims of our societies?”

This thinking prompted me two weeks ago to consider what else I should be learning in the year(s) ahead, which includes machine learning itself.
Believing as I do that technology can and should be used for good, and that we ought to use our best ideas in service of our best ideals, I can’t help but try to understand such an important piece of modern technology’s surface area, namely the intersection of machine learning, neural networks, artificial intelligence, and our chosen tomorrow.
As I wrote in the rest of today’s LinkedIn post: “In a time of such great problems, we can’t afford not to throw ourselves, with great delight and strong conviction, at reducing the number of mysteries that bar human progress.”
In this blog, my aim is to “open-source my learning journey in machine learning”, so that my current peers and future generations can see what it’s like for a human to learn about a technology that will someday power many foundations of the human experience.
As I’ve recently started to read about machine learning, to ask questions, and to ponder the unknown, this space is to serve as part diary, part blog, and part playground to build and share my neural networks in the future.
The wonderful thing about the human learning process is that it is one part neuroscience and one part magic, the output of the work of our biological neurons tussling and wrestling with connections, ideas, dead ends, mistakes, breakthroughs, awe, wonder, curiosity, memories, problems, and the dual selective pressures of life and imagination.
Someday, in the very far-off future, long after I’m gone, humanity may create learning machines capable of appreciating all the joys contained in that previous sentence.
Until then, I and we have a lot to learn about the human mind, about computers, about data, and how that data can be used to train computers to make better predictions on our behalf.
What will I find in this machine learning journey of the mind?

January 2, 2019: Resources for the Newcomer to Machine Learning
(Cross-posted from Resources for the Newcomer to Machine Learning)

Learning how to learn something new is always tricky.
It’s a bit like figuring out how to build a shelf to put new knowledge on before you know what kind of knowledge you’ll acquire, how long or high the shelves should be, or which tools you’ll need to build the shelves themselves.
I guess sometimes the best way to start is to, well, start.
You build as you go, and you ask to borrow tools as needed or help building shelves when you need to.
You also determine what’s irrelevant to your aims or too far beyond your current knowledge horizon to be useful now.
That’s the game of unstructured trial and error — the human version of reinforcement learning.
But first, you must start.
One way to divide the start is by segment, i.e., by source or type of resource:
Books to read
People to follow
Courses to take
And so forth.

Learning Segments
Below are some of the machine learning resources that I’ve come across so far or have been suggested to me.
This list is unprioritized and inexhaustive.
It is a snapshot in time, so I do not anticipate adding to this list in the future.

Books
An Introduction to Statistical Learning with Applications in R by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani, and Jerome Friedman
Fuzzy Sets and Fuzzy Logic: Theory and Applications by George J. Klir and Bo Yuan
Life 3.0: Being Human in the Age of Artificial Intelligence by Max Tegmark
Machine Learning: The New AI by Ethem Alpaydin
The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World by Pedro Domingos
Nexus (The Nexus Trilogy Book 1) by Ramez Naam (this is the only fiction book on the list)
Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering by Steven H. Strogatz
On Intelligence: How a New Understanding of the Brain Will Lead to the Creation of Truly Intelligent Machines by Jeff Hawkins and Sandra Blakeslee
Our Mathematical Universe: My Quest for the Ultimate Nature of Reality by Max Tegmark

Courses
Applying Machine Learning to your Data with GCP by Google Cloud Training at Coursera
“In-depth Introduction to Machine Learning in 15 Hours of Expert Videos” by Kevin Markham (2014) at R-bloggers or DataSchool.io (this supplementary resource comes highly recommended and includes the slides and videos to a course by the authors of An Introduction to Statistical Learning with Applications in R, above)
Neural Networks and Deep Learning by Andrew Ng, Kian Katanforoosh, and Younes Bensouda Mourri at Coursera
Neural Networks for Machine Learning - Geoffrey Hinton 2016, a 78-video playlist by Colin McDonnell at YouTube (the Coursera course itself appears to no longer be available)

Long Reads
“Markov Chain Monte Carlo Methods, Rejection Sampling and the Metropolis-Hastings Algorithm” by Brian Keng (2015) at Bounded Rationality
“Markov Chain Monte Carlo Models, Gibbs Sampling, & Metropolis Algorithm for High-Dimensionality Complex Stochastic Problems” by Yogesh Malhotra (2015) in The SSRN
“Marvin Minsky’s Vision of the Future” / “A.I.” by Jeremy Bernstein (1981) in The New Yorker
“Neuralink and the Brain’s Magical Future” by Tim Urban (2017) in Wait But Why
“Neuroscience-Inspired Artificial Intelligence” by Demis Hassabis, Dharshan Kumaran, Christopher Summerfield, and Matthew Botvinick (2017) in Neuron
“One Giant Step for a Chess-Playing Machine” by Steven Strogatz (2018) in The New York Times
“Progress Report on Artificial Intelligence” by Marvin Minsky and Seymour Papert (1971) at MIT

People
Pieter Abbeel: Berkeley reinforcement learning researcher
Francois Chollet: inventor of the Keras neural network library
Lex Fridman: MIT research scientist and AI podcast host
Demis Hassabis: co-founder of artificial general intelligence research company DeepMind
Andrej Karpathy: Tesla’s AI Director with a focus on Autopilot perception
Fei-Fei Li: Stanford professor and computer vision expert
Andrew Ng: professor, co-founder of Coursera, & deep learning expert
Carol Reiley: roboticist and co-founder of autonomous driving company drive.ai
Daniela Rus: roboticist and Director of MIT’s famed CSAIL laboratory

Miscellaneous Topics
How the class of problems you wish to solve impacts your choice of learning model, machine learning algorithm, and neural networks
How the computational expensiveness of certain neural network calculations (e.g., matrix multiplication) constrains your choice of software and hardware (i.e., the GPU-friendliness vs CPU-friendliness of machine learning)
The impact your choice of programming languages to learn (e.g., videos vs static images vs speech vs text
As with distributed systems and decentralized blockchains, the fit between what problems you want to solve and what combination of solutions (per above) to employ

Tidbits
Lex Fridman’s podcast, Artificial Intelligence Podcast, features interviews with the above-mentioned Pieter Abbeel and Max Tegmark as well as many others in the ML and AI spaces
The subject of generative adversarial networks (GANs) is apparently a hot area of research
3Blue1Brown’s video “But what IS a Neural Network?” is the best video I’ve yet seen on perceptrons and the structure & purpose of neural networks
“Deep Learning Cars” is a video by Samuel Arzt that simulates, in 2D, cars on a race course.
In the video’s description is a link to the source code for the simulation
Toronto computer hardware company Xanadu is working on advanced AI and “photonic quantum computing” chips to enable quantum applications of machine learning
For some excellent visuals and explanations of various neural network architectures, see the articles “The Neural Network Zoo Prequel: Cells and Layers” and “The Neural Network Zoo” by AI research company The Asimov Institute.
You can also read Andrew Tchircoff’s “The mostly complete chart of Neural Networks, explained” at Towards Data Science
Some neural networks can be run inside phone applications and are bundled in a file with special extensions (e.g., .mlmodel for apps running MLModel on the iPhone or .tflite for apps running TensorFlow Lite on Android devices).
So, for example, iPhone developers can integrate machine learning models into their apps by using Apple’s Core ML framework, the Core ML API, and the MLModel class.
The key point is that some neural networks work with the limited resources of a CPU on something like an iPhone, which is amazing, and perhaps a stepping stone to neural networks running on even lower-power IoT devices in the future.
Amazon AWS Machine Learning currently supports three types of machine learning models, namely binary classification, multiclass classification, and regression, each of which is ideal for making different types of predictions
Game theory; complexity theory; statistics of many kinds, as well as major statistical theories (e.g., Bayesian) and related techniques (e.g., regression); linear algebra, and other academic disciplines appear to be important layers to the work of artificial intelligence.
That is very interesting, something machine learning technologies appear to share with decentralized systems (blockchains): to work, they must draw on principles from economics, statistics, computer science, and mathematics
Some universities offer not only free online courses but also paid certificate programs, such as MIT Online’s professional education and Stanford Online’s graduate education, in various sub-areas of machine learning
For a short technical introduction to prior distributions, likelihood functions, & posterior probabilities in the software Stata, see Chuck Huber’s video “Introduction to Bayesian statistics, part 1: The basic concepts”
The MIT Sloan Management Review article “The Machine Learning Race Is Really a Data Race” by Megan Beck and Barry Libert raises good questions about the importance of unique data to train machine learning models used in commercial applications

Tools
Amazon Machine Learning
Azure Machine Learning Studio
Caffe & Caffe2
Colaboratory
DataCamp
Google Cloud AI & Machine Learning
PyTorch
TensorFlow

Supplemental Notes on Tools
Colaboratory, “a free Jupyter notebook environment that requires no setup and runs entirely in the cloud”, has a lot to love, but in particular comes with an interactive notebook version of Jake VanderPlas’s book, Python Data Science Handbook: Essential Tools for Working with Data, and links to the self-paced website, Machine Learning Crash Course by Google
Google Cloud Platform has excellent self-paced material in its Google Cloud Training Platform, which includes labs through Qwiklabs and three Data and Machine Learning learning tracks: one for data analysts, one for data engineering, and one for data scientists
For one opinion on PyTorch vs TensorFlow, see “Tensorflow or PyTorch : The force is strong with which one?” by Yashwardhan Jain

January 3, 2019: Should Machines Learn to Process Speech before They Learn to Process Text?
(Cross-posted from Should Machines Learn to Process Speech before They Learn to Process Text?)

Today I asked a question in a comment on LinkedIn.
My question was, essentially, whether there is an advantage to teaching machine learning models to process speech before we teach them to process text.
Let me walk you through my reasoning.

On Tools that Process Natural Language
Dr. Matt Wood of Amazon recently wrote a blog post announcing that AWS Machine Learning can be used by developers to build natural language processing models into their applications.
This relies on a service called Amazon Comprehend, which, as the name implies, can analyze text and perform reading comprehension-related tasks.
The service is trained with deep learning models and is able to detect certain words, perform sentiment analysis on the language of the text, or even categorize the text by certain topics.
Wood’s announcement was about two new capabilities for Comprehend: 1) the ability to further tailor the service’s searches to a particular organization’s lexicon, and 2) the ability to sort documents into custom classifications.
Apparently, this type of search-and-analyze is not easy to do, which is why using a machine learning service like Comprehend is useful.
This all falls within a category of applied machine learning called natural language processing.
Wood’s blog post reminded me of something I’ve seen in Alpaydin’s text, Machine Learning: bag of words.
Bag of words is a technique that represents a document by which words show up in it and how often, ignoring their order, so that the document can then be sorted into a category based on its contents.
This is one of the techniques used for spam filtering, for example.
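To make the idea concrete for myself, here is a minimal Python sketch of a bag-of-words classifier. It is a toy under stated assumptions, not how Comprehend or any real spam filter is built: the two category names and their keyword lists are made up for illustration, and real systems typically learn word weights from labeled examples rather than relying on fixed lists.

```python
from collections import Counter
import re

# Hypothetical keyword lists for two made-up categories (illustration only).
CATEGORY_KEYWORDS = {
    "spam": {"free", "winner", "prize", "click"},
    "not_spam": {"meeting", "report", "schedule", "project"},
}


def bag_of_words(text):
    """Lowercase the text, split it into words, and count each word.

    Word order is thrown away; only the counts remain (the "bag").
    """
    return Counter(re.findall(r"[a-z']+", text.lower()))


def classify(text):
    """Return the category whose keywords occur most often in the bag.

    Ties (including a document with no keywords at all) go to the first
    category listed in CATEGORY_KEYWORDS.
    """
    bag = bag_of_words(text)
    scores = {
        category: sum(bag[word] for word in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    return max(scores, key=scores.get)


message = "Click now, winner! Click to claim your free prize."
bag = bag_of_words(message)
label = classify(message)
print(bag["click"], label)
```

Calling `classify("Project meeting schedule attached")` with the same keyword lists picks the other category, because only the `not_spam` keywords appear in that document's bag.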
The day before Wood’s post, Nino Bice of AWS Artificial Intelligence wrote a post titled “Getting Started with Amazon Comprehend custom entities”.
In it, Bice describes how the Entities data type now supports “private, custom entity types”, specific to an organization, that map to individual words of importance to that organization.
Think of this as making the machine learning model more attuned to the language used within a company (or an industry such as healthcare) that is using an app powered by Comprehend’s APIs.
The organization’s custom text data trains Comprehend’s natural language processing model so that, once trained on the customer’s data set, the model can better predict how to classify the documents or text it is exposed to in the future.
Presumably, the bag of words technique is used somewhere in the process.

On Primitives of Sight and Sound
At this point, it’s helpful to point to a page in Alpaydin’s book (p. 103, Fig 4.2) describing how something called hierarchical processing works.
Basically, if you feed an image of text to a computer, and you want that computer to be able to recognize what word is contained in the image, it does that by using models trained to perform hierarchical processing, which begins by detecting the visual primitives of letters, “such as arcs and line segments”.
Alpaydin’s Figure 4.2 shows how two curves make an o, one vertical line makes an l, and so on.
In hierarchical processing, the machine would process the pixels that make up those primitives, then combine those primitives into single letters to process, then determine if a given word has a combination of letters that it has learned to recognize, and, finally, identify “more abstract relationships such as the relationship between ‘book’ and [the French equivalent] ‘livre’”.
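The layering described above can be sketched as a chain of lookups in Python. This is only a toy to show the shape of the hierarchy, under loud assumptions: a trained model learns each layer from data instead of using hand-written tables, the pixel layer is skipped entirely, and the stroke names and the entries for "b" and "k" are hypothetical (only the "o" and "l" entries follow the description of the figure).

```python
# Layer 1 -> 2: stroke primitives to letters.
# "o" and "l" follow the figure's description; "b" and "k" are made up.
PRIMITIVES_TO_LETTER = {
    ("curve", "curve"): "o",
    ("vertical_line",): "l",
    ("vertical_line", "curve"): "b",
    ("vertical_line", "diagonal", "diagonal"): "k",
}

# Layer 3: letter combinations the toy model has "learned" as words.
KNOWN_WORDS = {"book"}

# Layer 4: a more abstract relationship between words (English to French).
TRANSLATIONS = {"book": "livre"}


def recognize_letter(primitives):
    """Map one group of stroke primitives to a letter, or None if unknown."""
    return PRIMITIVES_TO_LETTER.get(tuple(primitives))


def recognize_word(letter_primitives):
    """Combine recognized letters into a word, or None if any step fails."""
    letters = [recognize_letter(group) for group in letter_primitives]
    if None in letters:
        return None
    word = "".join(letters)
    return word if word in KNOWN_WORDS else None


strokes_for_book = [
    ["vertical_line", "curve"],                 # b
    ["curve", "curve"],                         # o
    ["curve", "curve"],                         # o
    ["vertical_line", "diagonal", "diagonal"],  # k
]
word = recognize_word(strokes_for_book)
french = TRANSLATIONS.get(word)
print(word, french)
```

Each function only consumes the output of the layer below it, which is the point of the hierarchy: the word layer never looks at strokes directly, and the translation layer never looks at letters.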
I recall learning as a young boy how to write the letters o, l, and all the rest on big, manuscript ruled D’Nealian paper.
The goal of those scholastic routines must be to teach us to learn, identify, and write all the visual primitives of all the letters in the alphabet, which perhaps improves our ability to subsequently learn, identify, and read all the letters of the alphabet (I can’t recall if we learn to read or write first).
However, most of us learn to do something else before we learn to read or write: we learn to speak.
Speech, like letters, has its own primitives, except that they’re oral (when transmitted) and aural (when received) rather than visual:
Alpaydin (p. 67): “Just as we consider each character image to be composed of basic primitives like strokes of different orientations, a word is considered to be a sequence of phonemes, which are the basic speech sounds. In the case of speech, the input is temporal; words are uttered in time as a sequence of these phonemes, and some words are longer than others.”
We as humans begin to master the phonemes of speech before we master the visual primitives of letters and how they come together to form written words.
This brings me to my question.

Should Our Learning Machines Learn to Listen Before Reading (or Speaking)?
Since humans learn to listen and speak before we learn to read or write, I wonder if there would be some computational or other advantage to training speech learning models on speech data before training natural language processing models on text data, especially of the same words spoken as written, in order to improve efficiency or accuracy.
I just don’t yet know enough about how either of these works to have an answer.
Further, is there a speech recognition equivalent to the bag-of-words technique used for text classification of documents? Could such a “bag-of-spoken-words” technique be used in combination with techniques for written text to improve the models’ predictions?
From the question I posed:
…would there be a way to take a list of written words that would normally be used to train models for text classification, speak the words aloud and record them, put the speech data through a speech recognition algorithm to train a speech recognition NLP model, and then use the results to train Amazon Comprehend for text classification? Reason I ask is that in human learning, when we learn to read words, we likely associate the visual primitives of the letters with the speech primitives of the spoken versions of those letters, which I suppose accelerates the ability to read and process words because we can speak their spoken counterparts.
Thus, I wonder if in machine learning we are yet approaching things in a similar way, coupling NLP for text with NLP for speech during the training process.
There is much to explore.

Further Learning
Machine Learning by Ethem Alpaydin (2016)
“Deep learning” (Wikipedia)
Amazon Comprehend FAQs and Entity (AWS)
“Sentiment analysis” (Wikipedia)
“Natural language processing” (Wikipedia)
“Bag-of-words model” and “Naive Bayes spam filtering” (Wikipedia)
Amazon Comprehend Medical (AWS)
“Phonemes” (Wikipedia)

Until Next Time
And there you have it, blogs #1–3 on my new machine learning blog, A Human Learning Machine Learning.
If you like what you’ve read, go ahead and add me on LinkedIn, bookmark my blog or add it to your favorite RSS / Atom reader, and share this Medium article (or the original links) with others.
If you like my writing and want to add me as a contributing writer to submit this or other articles to your Medium publication, let me know.
Finally, if you’re a startup or tech company in need of a freelance writer for your blockchain, machine learning, or other project, send me a message on LinkedIn, via email at alexoblockchain@gmail.com, or as a note here.
I’m currently accepting clients.
Thanks for reading.