Iowa Liquor Sales Data Predictive Analysis Using Spark

Full-text information

Abstract

English
This paper analyzes and predicts liquor sales in the state of Iowa by building prediction models with machine learning algorithms. We use Azure ML and Spark ML for our predictive analysis, which are a legacy machine learning (ML) system and a Big Data ML system, respectively. We work with the Iowa liquor sales dataset, which comprises records from 2012 to 2019 in 24 columns and approximately 1.8 million rows. We conclude by comparing the models built with different algorithms, and their accuracy in predicting sales, on both Azure ML and Spark ML. On the legacy Azure ML system with the sample data set, we find that the Linear Regression model has the highest precision and Decision Forest Regression has the fastest computing time. On the Big Data Spark system with the entire data set, the Decision Tree Regression model in Spark ML has the highest accuracy and the quickest computing time.

Table of contents

ABSTRACT
Ⅰ. Introduction
Ⅱ. Related Work
Ⅲ. Machine Learning Algorithms
Ⅳ. Our Work
4.1. Azure ML
4.2. SparkML
Ⅴ. Experimental Results
Ⅵ. Conclusion

 Hardware Specifications

Authors

  • Ankita Paul [ Graduate student, Computer Information Systems, California State University, Los Angeles, USA ]
  • Shuvadeep Kundu [ Graduate student, Computer Information Systems, California State University, Los Angeles, USA ]
  • Jongwook Woo [ Professor, CIS Department, California State University, Los Angeles, USA ] Corresponding Author

References

Data provided by: Naver Academic Information

    Publication information

    • Publication
      Asia Pacific Journal of Information Systems
    • Frequency
      Quarterly
    • pISSN
      2288-5404
    • eISSN
      2288-6818
    • Coverage period
      1990~2026
    • Indexing status
      KCI-indexed, SCOPUS
    • Decimal classification
      KDC 325 DDC 658