We Like Big Data and We Cannot Lie!

Written by: Brett Hull

(Lead Systems Engineer)

Employee Voices


Disclaimer: Some Tech Talk Below!

“You’re still using Friendster? You HAVE to get on MySpace.” This phrase was probably last uttered back when Apple only sold iPods. As technology evolves, consumers are presented with more responsive, interactive, and intuitive products and services. One result of these emerging technologies is that consumers grow accustomed to a certain level of service, often ditching inferior products in a heartbeat. They expect faster apps, nearly instant page loads, and the ability to view content on every device they own. This new status quo drives newer technology that allows companies to scale up and accommodate those needs. Scaling up normally involves a business adding more resources (servers, databases, personnel) to handle the traffic and interactions on its platform. At Telescope, big data and ever-larger campaign volumes are our day-to-day reality as we run the largest campaigns across the biggest networks, brands, and leagues!

First, it might be beneficial to examine why we’re seeing so much more volume than in the past.

If you rewind the clocks 10 years, you would find that Telescope’s voting campaigns were much simpler and certainly more limited. For many of the early vote shows, viewers were limited exclusively to SMS and IVR (call-in) voting. Fast forward to today, and that very same campaign might allow a user to vote across a multitude of methods, including SMS, Facebook, Twitter, and online, and each method can multiply the vote count further by allowing multiple votes per day!
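To make the "multiple methods, multiple votes per day" idea concrete, here is a minimal sketch of how a per-method daily vote limit could be enforced. Everything in it is an assumption for illustration: the `VoteTally` class, the limit of 10, and the method names are hypothetical and do not describe Telescope's actual system.

```python
from collections import defaultdict
from datetime import date

# Hypothetical cap: each voter may vote this many times per method per day.
DAILY_LIMIT_PER_METHOD = 10


class VoteTally:
    """In-memory vote counter with a per-voter, per-method daily limit."""

    def __init__(self, daily_limit=DAILY_LIMIT_PER_METHOD):
        self.daily_limit = daily_limit
        # (voter_id, method, day) -> number of votes already accepted
        self._used = defaultdict(int)
        # contestant -> total accepted votes
        self.totals = defaultdict(int)

    def cast(self, voter_id, method, contestant, day):
        """Record one vote; return False if the voter hit the daily limit."""
        key = (voter_id, method, day)
        if self._used[key] >= self.daily_limit:
            return False
        self._used[key] += 1
        self.totals[contestant] += 1
        return True


tally = VoteTally(daily_limit=2)
today = date(2019, 1, 1)
for method in ("sms", "facebook", "twitter", "online"):
    for _ in range(3):  # the third attempt on each method is rejected
        tally.cast("fan-1", method, "BTS", today)
```

With four methods and a limit of two, a single fan in this sketch lands eight accepted votes in a day, which is why adding methods and raising limits multiplies volume so quickly.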

Accessibility and increased vote limits are contributing factors, but I’d be remiss not to mention another key reason why our vote volume has increased so drastically over the last decade: the rise of the fan army.

What exactly is a fan army? A fan army consists of thousands, millions, sometimes BILLIONS (OK, maybe not billions) of fans doing anything they can to support their artist. It’s basically a fan club on steroids. The fan army is still an emerging trend and is currently most prominent among K-pop bands, but new fan armies seem to be mobilizing every day.

Why have fan armies made us rethink how we handle our data? Well, let’s take a look at BTS, one of the most popular K-pop bands in the world. At the time this article was being written, BTS had 21.2 million Twitter followers, with most of their tweets getting between 300,000 and 1 million retweets and likes.

BTS has been featured as a contestant in several Telescope vote campaigns, and some of these campaigns allow you to vote by simply retweeting. This means it only takes one member of BTS tweeting a vote for themselves for the fan army to swing into action, retweeting and thus voting as much as it can. Imagine Thanos snapping the Infinity Gauntlet and, instead of wiping out half the universe, it just made half the universe vote for a boy band. (Hey, I think that’s a pretty good comparison.) That’s a LOT of votes.

As a result, Telescope had to evolve the way it handles big data and the sheer vote volume it receives, especially when votes come in from around the world. Large volumes in the past may have meant 50 million to 100 million votes, but now we run campaigns that have exceeded 1 billion votes… thanks, Thanos.

It’s often hard to discuss trends in big data without talking about the accompanying technologies that allow companies to support it. Distributed systems such as Hadoop and HBase are a cornerstone of big data technologies. They store redundant copies of data across multiple servers in order to be resilient and performant. Many companies run clusters that hold up to hundreds of petabytes of data with relative ease.

For many years Telescope used Hadoop and HBase very successfully; however, there were several setbacks with our implementation. Scaling up and rebalancing data was often a manual task, and if there was ever an unforeseen surge of volume, we would be required to scale up very quickly and for an extended period of time. This, in turn, brings us to the primary concern with our previous system: efficiency. Having a fully distributed, redundant HBase cluster that is scaled up to handle a very high volume of votes can get cumbersome and pricey. This is especially true when it is hosted on a cloud provider, where resources can become cost-prohibitive. Luckily, new technologies have provided a more robust and secure solution, and Telescope now utilizes Spark, which allows us to securely store our data in a much more efficient manner and scale up and down much more rapidly should the vote volume require it.

Of course, technology will continue to evolve, and clinging to antiquated architectures rarely, if ever, yields desirable results. The way I see it, as long as people stay passionate about supporting their favorite boy band (*NSYNC, anyone?), we should stay passionate about keeping up with their needs.