Developing actionable insights from data starts with understanding the business. Tonight we had the pleasure of featuring Young Won, who has several years of experience in the retail space and currently works as a data scientist across various industries. In this blog post, we revisit some of the machine learning techniques Young discussed during his talk, Predicting Future Occupancy with Vacation Rentals.
One of the most important parts of being a data scientist is having a process for taking a project from start to finish. The main objective of this project was to help the vacation rental company automate pricing decisions for future bookings. A better understanding of the business goals makes it easier to build a tool that effectively predicts future booking trends for Won’s client. In his presentation, Won discussed the CRISP-DM model he used to plan, design, and deploy this project. CRISP-DM is a six-step process, listed below:
- Business Understanding
- Data Understanding
- Data Preparation
- Modeling
- Evaluation
- Deployment
Identifying Key Factors
Every business or industry has unique factors that have a major impact on its success. In Young’s project, the client’s business is affected by seasonality: certain times of the year see more business activity than others because of a wide array of external factors. Identifying these factors during the first stage of the CRISP-DM approach was key to Won’s success. Won used a 4-4-5 calendar model, which divides each quarter into 13 weeks grouped as periods of four, four, and five weeks, to account for seasonality and rolling calendar years. Because every period starts on the same weekday and covers the same number of weeks each year, this let him compare data consistently across several years.
After developing a strategy for interpreting the trends, the next steps focused on identifying data sources and extracting the data. This stage also includes exploratory data analysis (EDA) and data cleaning. Cleaning and preparing the data is a critical step that precedes applying machine learning algorithms for prediction.
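To make the 4-4-5 idea concrete, here is a minimal sketch of how a date could be mapped to its fiscal quarter, period, and week. It is not taken from Won's presentation: the function name `period_445`, the example fiscal year start date, and the choice to fold a 53rd week into the final period are all assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Tuple


def period_445(day: date, fiscal_year_start: date) -> Tuple[int, int, int]:
    """Return (quarter, period, week) for `day`, all 1-based.

    `fiscal_year_start` is the first day of fiscal week 1. It is an assumed
    input here; each business picks this date by its own rule.
    """
    offset = (day - fiscal_year_start).days
    if offset < 0:
        raise ValueError("day falls before the fiscal year start")
    week = offset // 7 + 1  # 1-based fiscal week
    if week > 53:
        raise ValueError("day falls after this fiscal year")

    # Assumption: a 53rd "leap" week, when present, joins the last period.
    w = min(week, 52) - 1  # 0-based week, capped at 51
    quarter = w // 13 + 1  # every quarter has 13 weeks
    week_in_quarter = w % 13

    # Within a quarter, weeks 0-3 and 4-7 form the two 4-week periods,
    # and weeks 8-12 form the 5-week period.
    if week_in_quarter < 4:
        period_in_quarter = 1
    elif week_in_quarter < 8:
        period_in_quarter = 2
    else:
        period_in_quarter = 3

    period = (quarter - 1) * 3 + period_in_quarter
    return quarter, period, week


start = date(2018, 2, 4)  # hypothetical fiscal year start
print(period_445(start + timedelta(weeks=4), start))  # (1, 2, 5)
```

Grouping bookings by the returned period rather than by calendar month means each bucket always holds whole weeks, which is what makes year-over-year comparisons line up.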
Machine Learning Techniques Used
Once he had identified the business problems and extracted and cleaned the data, Won was ready to build the machine learning models. He used three types of machine learning classifiers: Logistic Regression, Random Forest, and Gradient Boosting.
The goal of trying multiple algorithms is to determine which one predicts most effectively. In the final stages of choosing the best algorithm, Won used multiple charts to compare the algorithms against each other. The ROC chart he used to compare all three classifiers is shown in the graph on the right. He used additional graphs to better visualize the performance differences between the three models. Once the most effective model was identified, the next step was to build it into the solution.
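For readers who want to try this kind of comparison themselves, the following is a rough sketch using scikit-learn. It does not reproduce Won's code or data: the dataset is synthetic (generated with `make_classification`), the binary target is only a stand-in for something like "booked or not booked", and the hyperparameters are arbitrary defaults chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the booking data, which is not public.
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three classifier families named in the talk, with assumed settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

auc_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # The ROC curve is built from the predicted probability of the positive class.
    positive_proba = model.predict_proba(X_test)[:, 1]
    auc_scores[name] = roc_auc_score(y_test, positive_proba)

# Rank the models by area under the ROC curve, best first.
for name, auc in sorted(auc_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: AUC = {auc:.3f}")
```

The area under the ROC curve condenses each curve into a single number, which is handy for ranking, but plotting the full curves (as in the talk) shows where one model beats another across thresholds.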
Pulling Everything Together
After all the analysis is complete, the last step is to create a product the client can use. The finished product is a tool that predicts future occupancy rates for the vacation rental company. The company is now empowered to make fast, effective pricing decisions backed by machine learning. One of the benefits of implementing a predictive analytics tool is consistency in pricing decisions.
It was a pleasure to host Young Won and learn more about his approach to solving complex problems using machine learning. Young Won has extensive experience handling data within the retail industry. If you have any questions about his presentation or the techniques he used, please feel free to message him on LinkedIn. Young Won’s contact information is also included on the last slide of the presentation deck.
Alex Brooks is the founder and CEO of AE Brooks, LLC (d/b/a Entreprov), a Seattle-based firm that builds custom predictive analytics and automation tools to enhance a company’s performance and decision making.