BIG, small or Right Data: Which is the proper focus?

These humongous data sets are collected via many different means, including computer networks, social media profiles, web browsing histories, mobile phone sensors, Internet of Things (IoT) devices, video data from (self-)driving and robotic applications, our commercial transactions and more. The complex task of processing and analyzing big data has pushed computer engineering and computer science on several fronts, such as distributed parallel processing (e.g., map-reduce and streaming architectures) and machine learning (e.g., deep learning).

Displayed graphically below, with companies plotted on one axis and the amount of data that they can gather on the other, we see that the data generated by most companies forms the torso and the diverse long tail of data at large. Big data has the benefit of malleability, meaning we can use big data to generate small data. One of the most common purposes of big data is to produce myriads of coherent, specialized small data sets, often created just from the transformation process itself.

Some people claim that “small data is the new big data” [7, 8], that “small data is the real revolution” [2] or that “small data is where the money lies” [9]. In fact, extremely small data contributes to our yes-or-no decisions for any important choice, which makes the amount of data needed to reach a given decision a primary concern. First ask yourself a few questions: Which type of data do I need?

If we factor in that most target data is personal and lives in a tiny, portable device, then we must preserve privacy and/or resolve the problem on a device that has limited computing power, memory, communication, and energy. Therefore, due to small data’s ubiquitous presence and large impact in the world of SMEs and individuals, it is crucial to understand it well. Additional aspects of data might be important for most applications, including how data is collected, the technology and software used, the data ontology employed, and the context in which the data is generated [12].

Lately, the use of small data has awakened interest among the scientific e-health community. Deborah Estrin [4] defines small data as “the picture of your personal health.” She spearheads initiatives that liberate data to the consumer, arguing that digital behavior leads to valuable knowledge about an individual’s personal health. [...] In those cases (maybe the most interesting ones), digital devices should be able to analyze the data locally, triggering an alarm only in case of emergency. In this and other contexts, small personal data refers to information related to a living individual. First, it is difficult to gather: personal data is considered private or sensitive, and most people are not comfortable sharing it.

We also need to explore the limits of small data and specify its uses precisely. Then we can learn whether the right data needs to be small or big.

