AirBnB Matches Apartments, Castles And Igloos With Guests Using Big Data
AirBnB is a short-term rental website, operational since 2008, that offers accommodations in 192 countries and has booked more than 10 million nights. Visitors can book apartments, private rooms, boats, castles, tree houses, igloos, private islands and other properties in 34,000 cities around the world. There are over 500,000 listings, including 600 castles. All these listings, visitors and bookings generate a lot of data. The website handles about 10 million web requests a day, including 1 million search queries. Every day, 20 terabytes of new data are created, and the company holds approximately 1.4 petabytes of archived data. All this data is used to offer better personalization, to better match hosts with guests and to create the best customer experience.
Data analytics is extremely important for AirBnB, as AirBnB Vice President of Engineering Mike Curtis told GigaOM: “We want to apply data to every decision. We want to be a very data-driven company.” According to Curtis, one of the challenges AirBnB faces is finding the best way to deliver personalized search. Standard search results built around community-wide rankings or geographical data are not sufficient for AirBnB. For AirBnB, personalized search means incorporating users’ preferences, rental history, social connections, reviews and any other relevant data source, which makes the process a lot more difficult.
AirBnB has analysed its data from the very beginning. As Riley Newman, Head of Analytics at AirBnB, told Chart.io in a blog post, one of the earliest analyses on record was a regression to determine which features of a listing had the most impact on bookings. The answer was the quality of the visuals, and once AirBnB started offering free professional photography to hosts, the “results were astounding”.
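To make the idea of such a regression concrete, here is a minimal sketch in Python. It is not AirBnB's analysis: the feature names, the six listings and the weights are all invented for illustration, and the booking figures are generated from those made-up weights so that the fit can be checked. It fits an ordinary least squares model and ranks the features by the size of their coefficients.

```python
# Illustrative sketch only: the listings and weights below are invented, not AirBnB data.

def fit_ols(rows, targets):
    """Fit ordinary least squares by solving the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # prepend a constant column for the intercept
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


# Hypothetical listing features, each already scaled to the range 0..1.
FEATURES = ["photo_quality", "price_level", "review_score"]
LISTINGS = [
    (0.2, 0.9, 0.5),
    (0.8, 0.4, 0.6),
    (0.5, 0.5, 0.9),
    (0.9, 0.7, 0.3),
    (0.1, 0.2, 0.4),
    (0.6, 0.1, 0.8),
]
# Made-up ground truth: intercept, then one weight per feature.
TRUE_WEIGHTS = [1.0, 4.0, -1.0, 2.0]


def simulate_bookings(row):
    """Generate a noise-free synthetic booking count from the made-up weights."""
    return TRUE_WEIGHTS[0] + sum(w * v for w, v in zip(TRUE_WEIGHTS[1:], row))


BOOKINGS = [simulate_bookings(row) for row in LISTINGS]

coefficients = fit_ols(LISTINGS, BOOKINGS)
impact = dict(zip(FEATURES, coefficients[1:]))
# Features share one scale, so coefficient magnitudes are directly comparable.
ranked = sorted(impact, key=lambda name: abs(impact[name]), reverse=True)
print(ranked)
```

Because all three features share the same scale, the coefficient magnitudes can be compared directly. With real data the features would need to be standardised first, and noise would call for confidence intervals rather than a bare ranking.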
AirBnB also uses predictive modelling to estimate how its different markets are going to perform, so that it can prioritize resources. This requires market-specific forecasts with many different variables but relatively little data, which makes the task a lot more difficult. Currently AirBnB has a dedicated team working on forecasting and reporting to optimize the predictive models. In addition, AirBnB mines the data to help hosts determine the best possible rates for their rentals.
Over 50 engineers work at AirBnB to optimize the platform using different Big Data technologies. They use tools such as Hadoop, Storm (to develop real-time analytics and online machine learning algorithms), Hive, Pig, Spark and Cascading to crunch all the data and develop compelling solutions. They run their analytics in the public cloud, using Amazon, and store their log data, among other things, on S3. This log data is used, among other purposes, to evaluate the A/B tests that are run constantly to optimize the website and apps. In addition, AirBnB uses the open source cluster-management project Mesos. Mesos allows AirBnB to run different types of computing frameworks on a single set of resources, and it provides efficient resource isolation across distributed platforms. They are also experimenting with Amazon’s Redshift to cut query times from hours to minutes.
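As an illustration of what evaluating an A/B test from log data can look like, the following Python sketch tallies bookings per variant and applies a standard two-proportion z-test. It is an assumption about the general technique, not a description of AirBnB's pipeline: the log format, the variant names and the visitor counts are all hypothetical.

```python
import math


def tally(records):
    """Count (visitors, bookings) per variant from (variant, booked) log records."""
    counts = {}
    for variant, booked in records:
        visitors, bookings = counts.get(variant, (0, 0))
        counts[variant] = (visitors + 1, bookings + int(booked))
    return counts


def two_proportion_z_test(bookings_a, visitors_a, bookings_b, visitors_b):
    """Return (z, two-sided p-value) for the difference in booking rates, B minus A."""
    rate_a = bookings_a / visitors_a
    rate_b = bookings_b / visitors_b
    pooled = (bookings_a + bookings_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical log: 1,000 visitors per variant, 100 bookings for A and 130 for B.
LOG = (
    [("A", True)] * 100 + [("A", False)] * 900
    + [("B", True)] * 130 + [("B", False)] * 870
)

counts = tally(LOG)
visitors_a, bookings_a = counts["A"]
visitors_b, bookings_b = counts["B"]
z, p_value = two_proportion_z_test(bookings_a, visitors_a, bookings_b, visitors_b)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

In this made-up example variant B's higher booking rate would count as statistically significant at the usual 5% level. A production setup would also have to account for repeated looks at the data and for running many tests at once.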
For any online-only organisation, data and analytics are vital. With a global community of travellers and hundreds of thousands of authentic and diverse accommodations across the globe, they become even more important if you want to offer the best service to your customers. AirBnB has developed the right flexible and scalable big data tools to continue its growth and to keep developing new big data solutions.