Research in Big Data News

Research in Big Data


Research in Big Data provides information about research on applications of big data in a variety of sectors, including education, transportation, government, and commerce. Research in Big Data examines some of the new technologies available to improve existing systems. These include cloud computing, new forms of database management systems, and developments in machine learning, artificial intelligence, data mining, and data analysis.

Data Analysis

Technologies for analyzing data.

Big Data

The management of large volumes and a wide variety of data.


Visualization and Reporting

Dashboards and reporting of data.

Sep 1, 2019 Issue 12

Big Data Applications in Industry

Big data analysis may continue to impact a number of industries (Koyce, 2017). Koyce (2017) describes several industries that have benefited from the analysis of big data, including social media, marketing, legal, and consumer industries. Amado, Cortez, Rita, and Moro (2018) implement topic modeling to analyze relevant topics for big data in marketing. O'Neal (2016) describes some features of building effective ecosystems of information and communication and highlights the successful strategies of networks, fixed costs, asset scale, rapid development, and target marketing.

Big data analysis may also be instrumental for science and education (Lettieri et al., 2018). Lettieri et al. (2018) describe a number of areas that can be improved with computational science, including literature analysis, data sharing, collaboration, code sharing, and research analysis. Belle Selene (2017) highlights the importance of web programming in educational themes such as distance learning, project-based learning, and collaborative active learning. Kim, Jung, Geierhos, and Kersting (2017) propose an architecture similar to data stream management for implementing devices for climate analysis.

Amado, A., Cortez, P., Rita, P., & Moro, S. (2018). Research trends on big data in marketing: A text mining and topic modeling based literature analysis. European Research on Management and Business Economics, 24(1), 1-7. doi:10.1016/j.iedeen.2017.06.002

Belle Selene, X. (2017). An in-depth analysis of teaching themes and the quality of teaching in higher education: Evidence from the programming education environments. International Journal of Teaching & Learning in Higher Education, 29(2), 245-254.

Kim, T., Jung, H., Geierhos, M., & Kersting, J. (2017). Internet of Things architecture for handling stream air pollution data. Paper presented at the Proceedings of the 2nd International Conference on Internet of Things, Big Data and Security.

Koyce, K. (2017). The challenges of using big data effectively critical analysis of the phenomenon of big data through the parameters of the end-user, industry uses and legal considerations.

Lettieri, N., Altamura, A., Giugno, R., Guarino, A., Malandrino, D., Pulvirenti, A., . . . Zaccagnino, R. (2018). Ex machina: Analytical platforms, law and the challenges of computational legal science. Future Internet, 10(5). doi:10.3390/fi10050037

O'Neal, S. (2016). The personal-data tsunami and the future of marketing: A moments-based marketing approach for the new people-data economy. Journal of Advertising Research, 56(2). doi:10.2501/jar-2016-027

Big Data Innovation and Sustainability

As big data innovation influences information technology, sustainable methods of including big data analysis in academia and practice may improve the technology (Hacker, 2018). Hacker (2018) explores techniques for including computational thinking in technology and engineering education to expand fundamental education, broaden computing education, and increase the status of technology and education in the educational system. Cicmil, Lindgren, and Packendorff (2016) describe the impact that vulnerability and resilience have on developing sustainable projects. Gomera, Oreku, Apiola, and Suhonen (2017) suggest that technological innovation can enhance small financial institutions and businesses.

Big data technologies may impact political and economic research (Briguglio, 2016). Briguglio (2016) includes measures of political governance in assessing the ability of a country's economy to withstand external influences. Kern, Stuart, Hill, and Green (2016) describe limitations and practical constraints when designing research for target populations. Sloan, Yang, and Wang (2015) describe techniques for resolving queries by analyzing adjacent and non-adjacent subqueries through documents for interpreting scenario-based predictions.

Regulations arising from concerns about privacy and security may also have an impact on big data technological advances (Tendam, 2018). Tendam (2018) describes how the Privacy Rule in the Health Insurance Portability and Accountability Act improves the privacy of electronic health information and makes exceptions for disclosing information to government organizations. Meijer and Thaens (2018) describe how subtle forms of surveillance can increase safety in smart cities but also present the risk of less privacy. Drabiak (2017) explains how the government can access genomic and health information as part of law enforcement.

Briguglio, L. P. (2016). Exposure to external shocks and economic resilience of countries: Evidence from global indicators. Journal of Economic Studies, 43(6), 1057-1078. doi:10.1108/jes-12-2014-0203

Cicmil, S., Lindgren, M., & Packendorff, J. (2016). The project (management) discourse and its consequences: On vulnerability and unsustainability in project-based work. New Technology, Work and Employment, 31(1), 58-76.

Drabiak, K. (2017). Caveat emptor: How the intersection of big data and consumer genomics exponentially increases informational privacy risks. Health Matrix, 27, 143-182.

Gomera, W. C., Oreku, G., Apiola, M., & Suhonen, J. (2017). Mobile training in micro business: Design science research for frugal innovation. Paper presented at the IEEE Africon 2017 Proceedings.

Hacker, M. (2018). Integrating computational thinking into technology and engineering education. Technology and Engineering Teacher, December/January 2018, 8-14.

Kern, H. L., Stuart, E. A., Hill, J., & Green, D. P. (2016). Assessing methods for generalizing experimental impact estimates to target populations. Journal of Research on Educational Effectiveness, 9(1), 103-127. doi:10.1080/19345747.2015.1060282

Meijer, A., & Thaens, M. (2018). Quantified street: Smart governance of urban safety. Information Polity, 23(1), 29-41. doi:10.3233/ip-170422

Sloan, M., Yang, H., & Wang, J. (2015). A term-based methodology for query reformulation understanding. Information Retrieval Journal, 18(2), 145-165. doi:10.1007/s10791-015-9251-5

Tendam, M. L. (2018). The HIPAA-Pota-Mess: How HIPAA's weak enforcement standards have led states to create confusing medical privacy remedies. Ohio State Law Journal, 79, 411.

Big Data Technology Companies and Trade Regulations

As big data technology companies grow, increased regulations may restrict how they control certain types of data (Green, 2018). Wang and Li (2017) describe how centralized planning and market simulation characterize a planned economy and how big data analytics may play a part in returning to a planned economy. Kumar (2017) details how international trade affects the enforcement of antitrust laws. Green (2018) explains how the creation of universal electronic health records could potentially violate antitrust laws.

An increase in software patents may also impact trade and competition (Mattioli, 2017). Chien (2016) describes how software patents have increased in value and in trade between companies. Allison, Lemley, and Schwartz (2017) study how companies that hold many patents mainly for the purpose of litigation are less successful in lawsuits than companies that operate with their patents. Mattioli (2017) describes how the collection of large amounts of data in big data could lead to competition concerns and reduce innovation.

Emerging models for regulation may also influence how big data technology is regulated (Tor, 2014). Purdy (2018) describes the challenge of enforcing environmental laws, since environmental law lacks the textual basis of constitutional law. Kaal and Vermeulen (2017) compare different techniques for regulating technological innovation, such as principle-based regulation or competition regulation. Tor (2014) describes behavioral antitrust as an emerging research area that employs empirical analysis to inform antitrust regulations.

Allison, J. R., Lemley, M. A., & Schwartz, D. L. (2017). How often do non-practicing entities win patent suits? Berkeley Technology Law Journal, 32, 237-310. doi:10.15779/Z38GM81P03

Chien, C. V. (2016). Software patents as a currency, not tax, on innovation. Berkeley Technology Law Journal, 31, 1669-1723.

Green, K. (2018). The universe in the palm of your hand: How a universal electronic health record system could improve patient safety and quality of care. DePaul Journal of Health Care Law, 19(2), 2.

Kaal, W. A., & Vermeulen, E. P. M. (2017). How to regulate disruptive innovation -- from facts to data. Jurimetrics: The Journal of Law, Science & Technology, 57(2), 170-190.

Kumar, S. (2017). Patent damages without borders. Texas Intellectual Property Law Journal, 25, 73-113.

Mattioli, M. (2017). The data-pooling problem. Berkeley Technology Law Journal, 32, 179-236. doi:10.15779/Z38R785P10

Purdy, J. (2018). The long environmental justice movement. Ecology Law Quarterly, 44, 809-864.

Tor, A. (2014). Understanding behavioral antitrust. Texas Law Review, 92, 573-667.

Wang, B., & Li, X. (2017). Big data, platform economy and market competition: A preliminary construction of plan-oriented market economy system in the information era. World Review of Political Economy, 8(2), 138-161.

Blockchain Technologies

Blockchain technologies may emerge to be an instrumental component for the next generation of information technology systems (Zou et al., 2018). Zou et al. (2018) describe how the basis of the blockchain is a consensus protocol which does not require a central authority. Zhang, Zheng, Gong, and Gu (2018) discuss how blockchain technologies can be implemented for access control in cloud environments.

Yin, Wen, Li, Zhang, and Jin (2018) explain how quantum computing could expose blockchain technologies to transaction attacks. Gao et al. (2018) propose a signature verification process to mitigate the risk that quantum computing poses to blockchain technologies. Miraz, Ali, Excell, and Picking (2018) explain how blockchain decentralization can enforce security for fog, edge, and cloud computing.

There are additional blockchain applications outside of financial systems (Liang, Weller, Luo, Zhao, & Dong, 2018). Liang et al. (2018) list applications such as robotic knowledge management, educational records, energy trading systems, and smart home environments. Jesus, Chicarino, de Albuquerque, and Rocha (2018) suggest ways that blockchain technologies can help secure Internet of Things devices, such as increasing the decentralization of networks, coordinating devices, and developing new techniques for conducting transactions.

Blockchain technologies are also being implemented to secure transactions (Hair Jr, Harrison, & Risher, 2018). Hair Jr et al. (2018) describe how blockchain technologies are being implemented in logistics for managing customer disputes and tracking packages. Cordella, Paletti, Chun, Adam, and Noveck (2018) provide an example of blockchain technologies implemented for instant property transactions. Chaieb, Yousfi, Lafourcade, and Robbana (2018) explain the possibility of blockchain for online e-voting systems.
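The tamper evidence that these applications depend on comes from each block storing the hash of the block before it. The following is a minimal sketch of that linking idea only, using Python's standard hashlib module. The function names, the record layout, and the parcel entries are hypothetical and chosen for illustration; the sketch does not implement a consensus protocol or any of the systems described in the cited papers.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder "previous hash" for the first block


def block_hash(index, data, previous_hash):
    """Hash the block contents deterministically with SHA-256."""
    payload = json.dumps(
        {"index": index, "data": data, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a block that points at the hash of the current last block."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "data": data,
            "previous_hash": previous_hash,
            "hash": block_hash(index, data, previous_hash),
        }
    )


def is_valid(chain):
    """Recompute every hash and check that each block links to its predecessor."""
    expected_previous = GENESIS_HASH
    for block in chain:
        if block["previous_hash"] != expected_previous:
            return False
        recomputed = block_hash(block["index"], block["data"], block["previous_hash"])
        if block["hash"] != recomputed:
            return False
        expected_previous = block["hash"]
    return True


# Hypothetical package-tracking entries.
ledger = []
for entry in ["parcel 17 shipped", "parcel 17 delivered", "dispute 3 opened"]:
    append_block(ledger, entry)

valid_before = is_valid(ledger)

# Editing an earlier record changes its hash, so validation fails.
ledger[1]["data"] = "parcel 17 lost"
valid_after = is_valid(ledger)
```

In a real deployment the chain is replicated across many nodes, and a consensus protocol decides which copy is authoritative; this sketch covers only the integrity check on a single copy.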

Chaieb, M., Yousfi, S., Lafourcade, P., & Robbana, R. (2018). Verify-Your-Vote: A Verifiable Blockchain-based Online Voting Protocol. Paper presented at the 15th European Mediterranean and Middle Eastern Conference on Information Systems, Limassol, Cyprus.

Cordella, A., Paletti, A., Chun, S. A., Adam, N. R., & Noveck, B. (2018). ICTs and value creation in public sector: Manufacturing logic vs service logic. Information Polity, 23(2), 125-141. doi:10.3233/ip-170061

Gao, Y.-L., Chen, X.-B., Chen, Y.-L., Sun, Y., Niu, X.-X., & Yang, Y.-X. (2018). A secure cryptocurrency scheme based on post-quantum blockchain. IEEE Access, 6, 27205-27213. doi:10.1109/access.2018.2827203

Hair Jr, J. F., Harrison, D. E., & Risher, J. J. (2018). Marketing Research in the 21st Century: Opportunities and Challenges. Revista Brasileira de Marketing, 17(05), 666-699. doi:10.5585/bjm.v17i5.4173

Jesus, E. F., Chicarino, V. R. L., de Albuquerque, C. V. N., & Rocha, A. A. d. A. (2018). A survey of how to use blockchain to secure Internet of Things and the stalker attack. Security and Communication Networks, 2018, 1-27. doi:10.1155/2018/9675050

Liang, G., Weller, S. R., Luo, F., Zhao, J., & Dong, Z. Y. (2018). Distributed blockchain-based data protection framework for modern power systems against cyber attacks. IEEE Transactions on Smart Grid, 1-1. doi:10.1109/tsg.2018.2819663

Miraz, M., Ali, M., Excell, P., & Picking, R. (2018). Internet of Nano-Things, Things and Everything: Future growth trends. Future Internet, 10(8). doi:10.3390/fi10080068

Yin, W., Wen, Q., Li, W., Zhang, H., & Jin, Z. (2018). An anti-quantum transaction authentication approach in blockchain. IEEE Access, 6, 5393-5401. doi:10.1109/access.2017.2788411

Zhang, J., Zheng, L., Gong, L., & Gu, Z. (2018). A survey on security of cloud environment: Threats, solutions, and innovation. Paper presented at the 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC).

Zou, J., Ye, B., Qu, L., Wang, Y., Orgun, M. A., & Li, L. (2018). A proof-of-trust consensus protocol for enhancing accountability in crowdsourcing services. IEEE Transactions on Services Computing, 1-1. doi:10.1109/tsc.2018.2823705

Energy Efficiency and Cloud Technologies

Energy efficient methods of data center management may reduce the carbon footprint of cloud technologies (Alalawi & Daly, 2017). Alalawi and Daly (2017) describe a number of techniques for improving energy efficiency for inactive nodes including clustering, dividing clusters based on resources, separating batch and interactive loads, and duplicating subsets of nodes. Krishnamoorthy and Sundaram (2018) develop a security assurance model based on similarity based clustering for evaluating cloud environments. Yousafzai et al. (2016) predict more heterogeneous pools of data centers in cloud computing such as multiple core processing units and collocated virtual machines.

Radu (2017) explains how efficient resource management can improve cloud computing performance, reduce energy consumption, and reduce financial costs. Shatnawi, Munson, and Thao (2017) present a system for maintaining essential digital documents offline without relying on a central repository. Wen, Yan, Zhang, Chinh, and Akcan (2016) explain how geographic information systems can apply to the optimization of shipping patterns.

Cloud technologies may also improve the ability of geographic information systems to monitor and regulate geological systems. Blume, Scott, and Pirog (2014) explain how geospatial data has influenced policy analysis for analysts and policy makers. Sybba et al. (2017) describe how Bayes' Theorem can be applied to locating flight crashes by optimizing over last known points. Raja, Çiçek, Türkoglu, Aydin, and Kawasaki (2016) apply logistic regression to measure landslide susceptibility with machine learning.
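As a small illustration of how Bayes' Theorem can guide a search, the sketch below updates the probability that a target lies in each cell of a search area after one cell has been searched without success. It is a textbook-style example under stated assumptions, not the method of Sybba et al. (2017): the function name, the three-cell prior, and the detection probability of 0.8 are all hypothetical.

```python
def update_after_failed_search(prior, searched, p_detect):
    """Return posterior cell probabilities after an unsuccessful search.

    prior     -- list of probabilities that the target is in each cell (sums to 1)
    searched  -- index of the cell that was searched
    p_detect  -- probability of finding the target if it is in the searched cell
    """
    # P(search fails) = 1 - P(target in searched cell) * P(detect | target there)
    p_fail = 1.0 - prior[searched] * p_detect
    posterior = []
    for i, p in enumerate(prior):
        if i == searched:
            # Bayes' Theorem: P(in cell | fail) = P(fail | in cell) * P(in cell) / P(fail)
            posterior.append(p * (1.0 - p_detect) / p_fail)
        else:
            # A failed search elsewhere cannot miss the target in this cell.
            posterior.append(p / p_fail)
    return posterior


# Hypothetical prior built around a last known point: cell 0 is most likely.
prior = [0.5, 0.3, 0.2]
posterior = update_after_failed_search(prior, searched=0, p_detect=0.8)
```

After the failed search, probability shifts away from the searched cell toward the others, which suggests where to search next.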

Song et al. (2017) explain how big data approaches can assist in the analysis of complex hydrogeological, environmental, meteorological, and economic data. Foga et al. (2017) demonstrate a classification algorithm for measuring cloud cover, validated against similar algorithms, to support remote sensing. Sava, Clemente-Harding, and Cervone (2016) apply parallel computing to automate image classification of geospatial datasets to support remote sensing.

Alalawi, M., & Daly, H. (2017). A Survey on Hadoop MapReduce energy efficient techniques for intensive workload. Paper presented at the Proceedings of the International Conference on Big Data and Internet of Thing - BDIOT2017.

Blume, G., Scott, T., & Pirog, M. (2014). Empirical Innovations in Policy Analysis. The Policy Studies Journal, 42(S1), S33-S51.

Foga, S., Scaramuzza, P. L., Guo, S., Zhu, Z., Dilley, R. D., Beckmann, T., . . . Laue, B. (2017). Cloud detection algorithm comparison and validation for operational Landsat data products. Remote Sensing of Environment, 194, 379-390. doi:10.1016/j.rse.2017.03.026

Krishnamoorthy, P., & Sundaram, S. (2018). Similarity-based clustering and security assurance model for big data processing in cloud environment. Economic Computation and Economic Cybernetics Studies and Research, 52(2/2018), 175-200. doi:10.24818/18423264/

Radu, L.-D. (2017). Green cloud computing: A literature survey. Symmetry, 9(12). doi:10.3390/sym9120295

Raja, N. B., Çiçek, I., Türkoglu, N., Aydin, O., & Kawasaki, A. (2016). Landslide susceptibility mapping of the Sera River Basin using logistic regression model. Natural Hazards, 85(3), 1323-1346. doi:10.1007/s11069-016-2591-7

Sava, E., Clemente-Harding, L., & Cervone, G. (2016). Supervised classification of civil air patrol (CAP). Natural Hazards, 86(2), 535-556. doi:10.1007/s11069-016-2704-3

Shatnawi, A., Munson, E. V., & Thao, C. (2017). Maintaining integrity and non-repudiation in secure offline documents. Paper presented at the Proceedings of the 2017 ACM Symposium on Document Engineering - DocEng '17.

Song, M., Cen, L., Zheng, Z., Fisher, R., Liang, X., Wang, Y., & Huisingh, D. (2017). How would big data support societal development and environmental sustainability? Insights and practices. Journal of Cleaner Production, 142, 489-500. doi:10.1016/j.jclepro.2016.10.091

Sybba, Patel, P., Girdhar, R., Mehta, R., Hora, R., Gupta, R., & Bardi, T. (2017). Bayes’ Theorem and Operations Research. Imperial Journal of Interdisciplinary Research (IJIR), 3(10), 404-412.

Wen, R., Yan, W., Zhang, A. N., Chinh, N. Q., & Akcan, O. (2016). Spatio-temporal route mining and visualization for busy waterways. Paper presented at the 2016 IEEE International Conference on Systems, Man, and Cybernetics, Budapest, Hungary.

Yousafzai, A., Gani, A., Noor, R. M., Sookhak, M., Talebian, H., Shiraz, M., & Khan, M. K. (2016). Cloud resource allocation schemes: Review, taxonomy, and opportunities. Knowledge and Information Systems, 50(2), 347-381. doi:10.1007/s10115-016-0951-y

Feature Selection in Data Analysis

Feature selection allows the construction of models by selecting a subset of features from a larger collection (Zhang, Zheng, Gong, & Gu, 2018). Alalga, Benabdeslem, and Taleb (2015) compare supervised, semi-supervised, and unsupervised feature selection and the increasing difficulty of each category in machine learning. Si, Yu, and Ma (2016) implement stochastic neighbor embedding to create a two-dimensional visualization of high-dimensional feature selection data in deep neural networks.

Feature selection can be a critical process for knowledge discovery in databases for data mining (Urbina Nájera & De la Calleja Mora, 2017). Leutner, Yearsley, Codreanu, Borenstein, and Ahmetoglu (2017) implement two types of feature selection, least absolute shrinkage and selection operator (LASSO) regression and sequential forward selection, for image selection. Choi and Lee (2018) combine feature selection and sampling for fraud detection in Internet of Things financial transactions.
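Sequential forward selection, one of the two techniques named above, can be sketched as a greedy loop: start with no features and repeatedly add the one feature that most reduces the model's error. The example below is a minimal sketch that uses only the Python standard library and ordinary least squares as the scoring model. The helper names, the synthetic data, and the choice of residual sum of squares as the criterion are assumptions made for illustration, not the setup used by Leutner et al. (2017).

```python
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def residual_sum_of_squares(rows, y, columns):
    """Fit least squares on the chosen columns plus an intercept; return the RSS."""
    design = [[1.0] + [row[c] for c in columns] for row in rows]
    p = len(columns) + 1
    # Normal equations: (D^T D) beta = D^T y
    gram = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    rhs = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(gram, rhs)
    return sum(
        (t - sum(b * v for b, v in zip(beta, d))) ** 2 for d, t in zip(design, y)
    )


def sequential_forward_selection(rows, y, k):
    """Greedily pick k feature indices, adding the best remaining one each round."""
    selected = []
    remaining = list(range(len(rows[0])))
    while len(selected) < k and remaining:
        best = min(
            remaining,
            key=lambda c: residual_sum_of_squares(rows, y, selected + [c]),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic data: six features, but the target depends only on features 1 and 4.
random.seed(0)
rows = [[random.gauss(0, 1) for _ in range(6)] for _ in range(200)]
y = [3.0 * r[1] - 2.0 * r[4] + random.gauss(0, 0.1) for r in rows]

chosen = sequential_forward_selection(rows, y, k=2)
```

The greedy loop refits the model once per candidate feature in every round, so the cost grows quickly with the number of features; this is one reason penalized methods such as LASSO are often used for high-dimensional data.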

Feature selection may be combined with other data mining algorithms, for example, to develop network policies for cloud computing (Meng, Qin, Liu, & He, 2018). Yoon, Niu, and Mozafari (2016) implement causal model estimates to describe, in simplified formats, the effects that may influence database performance. Wang et al. (2016) propose an algorithm that is suitable for high-dimensional features such as microbial data with large amounts of sequential data.

Alalga, A., Benabdeslem, K., & Taleb, N. (2015). Soft-constrained Laplacian score for semi-supervised multi-label feature selection. Knowledge and Information Systems, 47(1), 75-98. doi:10.1007/s10115-015-0841-8

Choi, D., & Lee, K. (2018). An artificial intelligence approach to financial fraud detection under IoT environment: A survey and implementation. Security and Communication Networks, 2018, 1-15. doi:10.1155/2018/5483472

Leutner, F., Yearsley, A., Codreanu, S.-C., Borenstein, Y., & Ahmetoglu, G. (2017). From Likert scales to images: Validating a novel creativity measure with image based response scales. Personality and Individual Differences, 106, 36-40. doi:10.1016/j.paid.2016.10.007

Meng, Y., Qin, T., Liu, Y., & He, C. (2018). An effective high threating alarm mining method for cloud security management. IEEE Access, 6, 22634-22644. doi:10.1109/access.2018.2823724

Si, Z., Yu, H., & Ma, Z. (2016). Learning deep features for DNA methylation data analysis. IEEE Access, 4, 2732-2737. doi:10.1109/access.2016.2576598

Urbina Nájera, A. B., & De la Calleja Mora, J. (2017). Brief review of educational applications using data mining and machine learning. Revista Electrónica de Investigación Educativa, 19(4). doi:10.24320/redie.2017.19.4.1305

Wang, Y., Li, R., Zhou, Y., Ling, Z., Guo, X., Xie, L., & Liu, L. (2016). Motif-based text mining of microbial metagenome redundancy profiling data for disease classification. BioMed Research International, 2016, 6598307. doi:10.1155/2016/6598307

Yoon, D. Y., Niu, N., & Mozafari, B. (2016). DBSherlock: A performance diagnostic tool for transactional databases. Paper presented at the Proceedings of the 2016 International Conference on Management of Data - SIGMOD '16.

Zhang, J., Zheng, L., Gong, L., & Gu, Z. (2018). A survey on security of cloud environment: Threats, solutions, and innovation. Paper presented at the 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC).