Iowa Liquor Sales Data Predictive Analysis Using Spark

English: This paper analyzes and predicts liquor sales in the state of Iowa by applying machine learning algorithms to build predictive models. We use Azure ML and Spark ML for the predictive analysis, representing a legacy machine learning (ML) system and a Big Data ML system, respectively. We work with the Iowa liquor sales dataset, which comprises records from 2012 to 2019 in 24 columns and approximately 1.8 million rows. We conclude by comparing the models built with different algorithms, and their accuracy in predicting sales, on both Azure ML and Spark ML. We find that, on the sample data set in the legacy Azure ML system, the Linear Regression model has the highest precision and Decision Forest Regression has the fastest computing time. On the entire data set in the Big Data Spark system, the Decision Tree Regression model in Spark ML has the highest accuracy and the quickest computing time.

Ankita Paul [ Graduate student, Computer Information Systems, California State University, Los Angeles, USA ]
Shuvadeep Kundu [ Graduate student, Computer Information Systems, California State University, Los Angeles, USA ]
Jongwook Woo [ Professor, CIS Department, California State University, Los Angeles, USA ] Corresponding Author

Data provided by: Naver Academic (네이버학술정보)

Earticle