The Digital Survivors: Analytic Perspective of the Holocaust
Yoav Tepper
May 7
“Soon, there will be no survivors to tell their stories, and your children will hear about the Holocaust only from books and videos.”
My teacher said this about 18 years ago after showing my class a documentary film about Holocaust survivors.
Photo by Alessio Maffeis on Unsplash
The Holocaust, also known as “The Shoah” (Hebrew: השואה), was a genocide during World War II in which Nazi Germany, aided by local collaborators, systematically murdered ~6,000,000 Jews.
As a grandson of Holocaust survivors and an avid data researcher, I assumed that in the “Data Science” era, when big data can be found with little effort, a short Google search would turn up some raw data about Holocaust victims.
Surprisingly, my Google searches yielded almost nothing.
The only significant “Holocaust database” I could find was that of “Yad Vashem,” the World Holocaust Remembrance Center, which contains millions of records with biographical details of Holocaust victims, carefully gathered from 2004 to this day.
Even though this database is publicly accessible, access is through an online (and limited) query form, which makes it impossible to work with the data in more suitable analysis tools.
Can it be that the data of one of the most significant episodes in world history is not available in its raw form for non-profit research?
So I decided to investigate the “Yad Vashem” website (technical note: I worked out the “hidden API” by filtering XHR and fetch requests in Chrome DevTools) and was able to create a Python script that automatically queries and stores the data of ~7.5 million entries of Holocaust victims (note: it was all done according to the “proper usage” and “privacy” terms of the “Yad Vashem” website).
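For readers curious about what such a script might look like, here is a minimal sketch of the general "query page by page, then store" pattern. It is not the script used for this article and it does not reflect the real “Yad Vashem” API: the `fetch_page` callable, its offset/limit parameters, the record fields, and the JSON Lines output are all assumptions made for illustration. A fake in-memory source stands in for the HTTP call so that the sketch runs offline.

```python
import json
import time
from typing import Callable, Iterable, Iterator

# Hypothetical signature: (offset, page_size) -> list of record dicts.
FetchPage = Callable[[int, int], list]


def iter_records(fetch_page: FetchPage, page_size: int = 100,
                 delay_s: float = 1.0) -> Iterator[dict]:
    """Yield records page by page until the source returns an empty page."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            break
        yield from page
        offset += len(page)
        time.sleep(delay_s)  # pause between requests to stay polite to the server


def save_jsonl(records: Iterable[dict], path: str) -> int:
    """Write one JSON object per line and return the number of records written."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
            count += 1
    return count


def make_fake_source(total: int) -> FetchPage:
    """Stand-in for a real HTTP request (assumption: the source is paginated)."""
    def fetch_page(offset: int, limit: int) -> list:
        return [{"id": i} for i in range(offset, min(offset + limit, total))]
    return fetch_page
```

In a real run, `fetch_page` would wrap whatever request the browser's network panel reveals, and the delay and page size would be chosen to respect the site's terms of use.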
Even though “Yad Vashem’s” information is far from complete (duplicate entries, missing entries, unknown sources, etc.), it is still the best there is, and with this exclusive data in hand, I could now see the Holocaust from a convenient yet disturbing perspective:
Europe heat map: victims per country of death (note: borders are not accurate, as I used a modern geocoding engine instead of a 1938 one)
Reason of death, per country of residence
Number of victims between 1938–1945, colored by documented fate
Top originating cities of the victims, colored by country of death
Main traffic routes of victims between 1938–1945
Needless to say, this is just the tip of the iceberg. These are just a few summaries I made with “Tableau” and “Python”; they do not claim to be accurate, only to show how such data could be utilized.
Imagine taking into account the relationships between victims’ surnames and geographic locations, or clustering groups of victims by the similarity of their locations on certain dates.
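As a rough illustration of the first idea, the sketch below counts how often each surname appears with each place. It is only an assumption of how such an analysis could start: the field names `surname` and `place` are hypothetical, not the actual schema of the database, and real records would need far more careful normalization (spelling variants, transliteration, historical place names).

```python
from collections import Counter, defaultdict


def surname_place_counts(records):
    """Map each normalized surname to a Counter of the places it appears with."""
    counts = defaultdict(Counter)
    for rec in records:
        surname = rec.get("surname")  # hypothetical field name
        place = rec.get("place")      # hypothetical field name
        if surname and place:
            # Case-insensitive surname key; incomplete records are skipped.
            counts[surname.strip().casefold()][place.strip()] += 1
    return counts


def most_common_place(counts, surname):
    """Return (place, count) for the surname's most frequent place, or None."""
    places = counts.get(surname.strip().casefold())
    return places.most_common(1)[0] if places else None
```

Called on a list of record dictionaries, `surname_place_counts` returns a nested mapping that can be queried directly or passed on to a plotting or clustering step.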
One can imagine that a comprehensive study might provide new insights that benefit the international community, from a better historical understanding of the war to further information about the fate of one’s own family.
Unfortunately, there are almost no Holocaust survivors left, and the passage of time only makes this worse.
But in a reality where the human survivors are disappearing, let us at least utilize the “digital survivors” to tell their stories.
Written by Yoav Tepper, grandson of Holocaust survivors.
I contacted “Yad Vashem,” asking if they would allow me to publish their database.
I truly believe that if anyone in the world could access this exclusive and important information, it might initiate or encourage new research, never done before, in memory of these victims.
I am still waiting for “Yad Vashem’s” official response, but I will post an update here in the comments section if they approve my request.
================
According to the terms and conditions of the “Yad Vashem” website, I declare that I do not own any rights to the data.
The “Yad Vashem” data was used for personal and educational purposes only.
The data belongs to “Yad Vashem,” and the only valid data is that presented on the “Yad Vashem” website.
In this article, I used country names, years, and causes of death.
No personal information related to the identities of the victims was used.