Harvey Alférez, Ph.D.
Data Scientist, School of Engineering and Technology, Montemorelos University, Mexico
Big data and data science are terms that you may have heard in the news or in meetings. But what do they mean? How can you use them to support the church’s mission at the NAD? Which easy-to-use (or even free) tools can help you get the most out of data? A series of articles on this blog will try to answer these questions. This first, introductory article gives an overview of big data and data science and describes their potential for the church. Subsequent articles will focus on more technical and methodological details.
What is Big Data?
Big data is a term used to describe datasets so large and complex that they become difficult to work with using standard techniques. The digital universe is huge, doubling in size every two years. By 2020 it will reach 44 zettabytes, or 44 trillion gigabytes. This growth has motivated companies and scientists around the world to find new ways to understand big data in the digital universe. Big data is definitely the next big thing, so much so that people are calling it the new oil.
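To make those figures concrete, here is a minimal sketch of the arithmetic behind them. It assumes decimal (SI) units, where one gigabyte is 10^9 bytes and one zettabyte is 10^21 bytes, and it treats "doubling every two years" as smooth exponential growth. The function names are made up for this illustration.

```python
# Assumption: decimal (SI) units, not binary (GiB/ZiB) units.
BYTES_PER_GB = 10**9
BYTES_PER_ZB = 10**21


def zettabytes_to_gigabytes(zettabytes):
    """Convert a whole number of zettabytes to gigabytes."""
    return zettabytes * BYTES_PER_ZB // BYTES_PER_GB


def projected_size(start_zb, years, doubling_period_years=2):
    """Project a size forward, assuming it doubles every `doubling_period_years`."""
    return start_zb * 2 ** (years / doubling_period_years)


gigabytes = zettabytes_to_gigabytes(44)
print(f"44 ZB = {gigabytes:,} GB")  # 44 followed by 12 zeros, i.e. 44 trillion
print(f"Two years later: {projected_size(44, 2)} ZB")
```

Under these assumptions, 44 zettabytes is exactly 44 trillion gigabytes, and one more doubling period would bring the total to 88 zettabytes.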
Big datasets tend to be more unstructured, distributed, and complex than ever before. The most relevant characteristics of big data fall into three dimensions: 1) the volume of information that systems must ingest, process, and disseminate; 2) the velocity at which information grows or disappears; and 3) the variety of data sources and formats. IBM even proposes a fourth dimension: the veracity of uncertain data.
There is a wealth of big data that our church can dig into, both internally and externally. In the case of external data, we can make use of open data, which is data that anyone can access, use, or share. For example, Data.gov offers many ready-to-use open datasets published by the U.S. Government. Through Data.gov, church management, pastors, and members can find datasets related to health, education, and much more. Can you imagine what we could learn if we used all that data to understand the current needs of our communities (e.g., in terms of health-related issues)?
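As a rough illustration of what working with an open dataset might look like, the sketch below reads a small CSV file and finds, for each health indicator, the county with the highest value. The rows, column names, and county names are invented for this example and do not come from Data.gov; a real dataset would be downloaded as a file and would have its own columns.

```python
import csv
import io

# Made-up sample rows standing in for a downloaded open dataset (not real figures).
SAMPLE_CSV = """county,indicator,value
Adams,adult_obesity_pct,31.5
Adams,uninsured_pct,12.0
Baker,adult_obesity_pct,27.0
Baker,uninsured_pct,18.5
"""


def highest_by_indicator(csv_text):
    """Return {indicator: (county, value)} for the county with the highest value."""
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        indicator = row["indicator"]
        value = float(row["value"])
        # Keep the row only if it beats the best value seen so far.
        if indicator not in best or value > best[indicator][1]:
            best[indicator] = (row["county"], value)
    return best


for indicator, (county, value) in highest_by_indicator(SAMPLE_CSV).items():
    print(f"{indicator}: highest in {county} ({value})")
```

A summary like this is only a starting point, but it suggests how a local church could spot where a particular need is greatest.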
Big data opens new opportunities for the Seventh-day Adventist Church. For instance, previous research shows how big data was used to understand how people perceive our church’s fundamental beliefs. The dataset used in the experiments of that research was composed of digitized texts containing about 4 percent of all books ever printed between 1800 and 2008.
What is Data Science?
Data science can be defined as the study of the generalizable extraction of knowledge from data. Data science calls for multi-disciplinary approaches that incorporate theories and methods from many fields, including mathematics, statistics, pattern recognition, knowledge engineering, machine learning, and high-performance computing. Put simply, data science is the science about data.
In an article published in Harvard Business Review, Davenport and Patil state that the data scientist possesses the training and curiosity to make discoveries in the world of big data. He or she is a hybrid of data hacker, analyst, communicator, and trusted adviser. More than anything, what data scientists do is make discoveries while swimming in data.
It is important to mention that data science is not restricted to big data. However, the rapid growth of data and the invention of new tools for big data analysis are opening a new era for data science.
An article in Adventist Review describes how data science has been used to understand the needs of people in New York City. This metropolis has central significance in our church’s ongoing Mission to the Cities project. Specifically, machine learning was used to analyze people’s sentiments in tweets related to several topics.
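To give a feel for how machine learning can label the sentiment of short texts, here is a minimal sketch of one common technique, a Naive Bayes classifier with add-one smoothing. This is not the method used in the Adventist Review study, whose details are not given here. The training sentences are invented, and the class and function names are hypothetical; a real project would need far more labeled tweets.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Tiny multinomial Naive Bayes classifier (illustrative only)."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)  # number of examples per label
        self.word_counts = {}              # label -> word frequencies
        self.vocab = set()
        for text, label in zip(texts, labels):
            counts = self.word_counts.setdefault(label, Counter())
            for token in tokenize(text):
                counts[token] += 1
                self.vocab.add(token)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training examples, not real tweets.
texts = [
    "I love the parks in this city",
    "Great food and friendly people today",
    "I hate the traffic and the noise",
    "Feeling lonely and tired of this commute",
]
labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesSentiment().fit(texts, labels)
print(model.predict("friendly people and great parks"))
print(model.predict("tired of the noise and traffic"))
```

With this toy training set, the first message is labeled positive and the second negative. The same idea, trained on many real examples, can help reveal how people in a city feel about a given topic.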
As Seventh-day Adventists, our mission is to fulfill the Great Commission. To do so, we are called to follow Christ’s method: to understand and meet people’s needs. What if we carried out data-driven strategies, made use of big data technologies, and leveraged data science techniques to understand and meet those needs? Think about this question until our next article!
1. C. Snijders, U. Matzat, and U. D. Reips, “ ‘Big Data’: Big Gaps of Knowledge in the Field of Internet Science,” International Journal of Internet Science 7, no. 1 (2014): 1-5.
2. EMC Corporation, “The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things” (2014), www.emc.com/leadership/digital-universe/2014iview/executive-summary.htm.
3. P. Rotella, “Is Data the New Oil?” (2012), www.forbes.com/sites/perryrotella/2012/04/02/is-data-the-new-oil/.
4. Z. Wu and B. C. Ooi, “From Big Data to Data Science: A Multi-disciplinary Perspective,” Big Data Research 1, no. 1 (2014): 1.
5. IBM, “The Four V’s of Data” (n.d.), http://www.ibmbigdatahub.com/infographic/four-vs-big-data.
6. Open Data Institute, “What is Open Data?” (n.d.), http://theodi.org/what-is-open-data.
7. G.H. Alférez, “Big Data for Reaching a Big World,” Adventist Review 192, no. 11 (2015): 47-51.
8. V. Dhar, “Data Science and Prediction,” Communications of the ACM 56, no. 12 (2013): 64-73.
9. T.H. Davenport and D.J. Patil, “Data Scientist: The Sexiest Job of the 21st Century,” Harvard Business Review 90, no. 10 (2012): 71-76.
10. G.H. Alférez, “Tweeting in New York City - Data Science Can Teach Us to Sympathize,” Adventist Review 193, no. 2 (2016): 47-49.