Things I Learned From the SciPy 2019 Lightning Talks

numba is pretty easy to use: no build steps (compare with Cython), just Python code.
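As a rough illustration of the "just Python code" point, here is a minimal sketch (not from the talk). The function `sum_of_squares` is a made-up example, and the fallback decorator is only there so the snippet still runs where Numba is not installed.

```python
try:
    from numba import njit  # JIT-compiles the decorated function on first call
except ImportError:
    # Assumption for this sketch: fall back to plain Python if Numba is missing
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    # An ordinary Python loop; Numba compiles it to machine code, no build step
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(10))  # 285
```

There is no separate compile step or extension module to build: the decorator is the only change to the plain Python function.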

Original problem: parsing 150 MB of Avro bytes into a Pandas DataFrame with fastavro took 75 seconds.

New solution: 5 seconds.

The new solution uses Arrow as an intermediate format, as it better supports NULL values and there is a clear path from bytes -> Arrow arrays & table -> Pandas DataFrame.

Fast things make Tim happy.

  napari – Multidimensional image viewer for Python, Nicholas Sofroniew (I believe)

Short overview of the image viewer.

  Leveraging open source software to recreate the coastline, Kim Pevey

Problem: Digitizing the coastline from aerial imagery traditionally took a long time (weeks to months).

Solution: Open source tools cut this to 3-5 minutes!

  Ibis: Python data analysis productivity framework, Ivan Ogasawara

Why use Ibis?

  Dask-Gateway, Jim Crist

Dask is like Spark, but “friendly and flexible,” and in Python.

Where to deploy Dask? Kubernetes, YARN, or HPC clusters.

Dask-Gateway exists to accommodate common feature requests from some users which seem above and beyond “just Dask.”

In short: “Like JupyterHub, but for Dask.”

Not well documented 🙂 What there is can be found here.

  Frankenstein's model, or understanding your monster, Dillon Niederhut

The factorials in the above equation generally make this intractable for any non-trivial problem.

SHAP takes a stab at doing this, and for some tree-based models the time required has been shown to be quadratic in the depth of the trees.
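To see where the factorials come from, here is a brute-force sketch. It assumes the equation in question is the classic Shapley value formula (the notes do not reproduce it), and `toy_value` is a hypothetical stand-in for "model output given a subset of features".

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating every coalition of the other players."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Classic weight: |S|! * (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of p when added to coalition s
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result


def toy_value(coalition):
    # Hypothetical payout: additive effects plus an interaction between a and b
    base = {"a": 1.0, "b": 2.0, "c": 3.0}
    v = sum(base[f] for f in coalition)
    if "a" in coalition and "b" in coalition:
        v += 1.0
    return v


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)
```

Each feature needs a pass over every subset of the remaining features, so the work doubles with each feature added. That is fine for three features and hopeless for a few dozen.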

How can we make a non-linear model more easily interpretable?

Train a linear model to predict the behavior of the non-linear one; this linear model would then be much more interpretable.

The tornado image above can show correlations of features with positive and negative outcomes (in this case, burrito quality), which are learned from the complex non-linear model (in this case, random forest).

This can generate actionable insights as an outcome, while still harnessing the power of a more complex non-linear model to do the heavy lifting.
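The surrogate idea can be sketched in a few lines. This is only an illustration under stated assumptions: `black_box` is a made-up function standing in for the trained non-linear model (a random forest in the talk), and the data is synthetic. It needs only NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 3))  # synthetic features


def black_box(X):
    # Hypothetical stand-in for the complex model's predictions
    return 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * np.sin(5 * X[:, 2])


# Fit the linear surrogate to the model's predictions, not to the true labels
y_hat = black_box(X)
A = np.column_stack([X, np.ones(len(X))])  # add an intercept column
coef, *_ = np.linalg.lstsq(A, y_hat, rcond=None)
weights, intercept = coef[:3], coef[3]

# Sign and size of each weight summarize how the black box uses each feature
print(weights, intercept)
```

The surrogate's weights are what a tornado-style plot would display: positive weights push the prediction up, negative ones push it down.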

  LFortran – A modern interactive LLVM-based Fortran compiler

Do you use Python for prototyping and algorithm development, but then have to translate Python to a production language manually in order to see the actual benefits?

If your production language is Fortran, why not compile and execute Fortran directly in Jupyter notebooks instead!

The project is in its early stages and has a roadmap. It envisions a scenario where Fortran can be used in Jupyter to prototype, with the notebook then outputting modern, usable Fortran for a production environment, among other uses.

  Where to put your custom code?, Chris Barker

This talk is a suggestion for how to manage your personal library of Python functions you might use for scripting, data analysis, etc.

Python packaging isn't just for distribution, but can be used for this purpose as well.

Just don't take the step of putting your personal packages on PyPI.

How? It's easy: use the proper directory tree.

Though the Python packaging documentation says that setup.py requires requirements, metadata, and more for distribution, a few simple lines are enough for your own packages.

That's it!

To make it simpler, Chris has created a script which does all of this packaging for you.
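As a sketch of what such a layout and setup.py might look like (this is not Chris's script): the helper `make_package_skeleton`, the package name `my_code`, and the module name `some_code.py` are all hypothetical. It uses only the standard library and writes into a temporary directory.

```python
from pathlib import Path
import tempfile

# A deliberately minimal setup.py: name, version, and the package to include
SETUP_PY = '''from setuptools import setup

setup(
    name="{name}",
    version="0.1.0",
    packages=["{name}"],
)
'''


def make_package_skeleton(root, name):
    """Create root/setup.py and root/<name>/ with an __init__.py and one module."""
    root = Path(root)
    pkg_dir = root / name
    pkg_dir.mkdir(parents=True, exist_ok=True)
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "some_code.py").write_text("def hello():\n    return 'hello'\n")
    (root / "setup.py").write_text(SETUP_PY.format(name=name))
    return root


project = make_package_skeleton(Path(tempfile.mkdtemp()) / "my_code", "my_code")
print(sorted(str(p.relative_to(project)) for p in project.rglob("*")))
```

From inside the project directory, a common way to use such a package is an editable install (`pip install -e .`), so edits to the code are picked up without reinstalling. That step is an assumption here, not something these notes record.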

  Let's Do This Together: Diversity and Inclusion, and Mental Wellness, Steven Silvester

This talk is a brief discussion of the benefits of diversity and inclusion, and how it relates to open source development, coding and more.

This is followed up with a personal story of mental wellness and how to get a handle on it.

  The mouse aging cell atlas aka Cell biology meets Python, Angela Oliveira Pisco

Using scientific Python, Angela and team have been able to manage 0.5 million cells across 20 tissues and 6 different mouse ages.

Cells were classified, an annotated collection of cells was created, and this was made public for anyone to use.

As a result, they were able to answer numerous cellular biology questions after classification and cluster analysis.

An example of this is “does the fraction of cells responsible for a particular poor biological behavior change over time?”
