Scaling a popular internet radio station, an interview with Mark, creator of Stereodose

Or: What happened when I hit the front page of reddit...

Mark is the creator of Stereodose, a wildly popular internet radio station with a unique approach to generating playlists: you pick your drug of choice, you pick your mood, and a customised selection of tracks starts to play.

Stereodose is hosted by PythonAnywhere, a Python-focused PaaS and browser-based programming environment.

Stereodose.com stats
Users: 80k
Pageviews: 500k / month
Unique visitors: 70k / month

Screenshot of the stereodose front page

Disclaimer: PythonAnywhere LLP naturally does not condone the use of harmful drugs. This is why we do not support PHP. Although we do believe that people should be free to use PHP if they wish to, in the privacy of their own home.

Here's our exclusive interview with Mark:

What's your background? How long have you been programming?

I graduated from college in 2011, with a degree in Economics. I wasn't interested in finance or consulting, which seemed to be what all my classmates were doing, so I started working at a small music marketing company out of college. I interacted with many blogs at the marketing job, and I wanted to create something with more functionality than what I was seeing, especially one that catered to my music tastes.

Around late 2011, I started learning how to program, with my main goal being to build my own WordPress music blog (learning HTML, JavaScript, and PHP). I enjoyed programming and creating much more than marketing, so I eventually quit my job in 2012 and started learning more/resume building. In total, I've been programming a little over 2 years now.

What first gave you the idea to build a site like this?

I came up with the idea some time in 2011. While coming up with a theme for a music blog I wanted to create, I realized there weren't any music blogs focused on "drug music." I created the music blog, but it didn't really go anywhere. I decided the "drug music" message needed to be as blunt as possible, so I came up with a "pick your drug, we'll make your playlist" format.

Why did you choose web2py and PythonAnywhere? What do you like about them? What do you hate about them?

At first, I used WordPress to build the site. It was doable, but I was forcing WordPress to do something that it wasn't optimized to do. I started looking into other options, realizing a web app framework would be my best bet. After doing lots of research, I came up with Rails, Django, and web2py. Looking at languages, I found PHP to be really ugly, with complex syntax, compared to Ruby and Python.

I researched each framework as thoroughly as I could, although I didn't understand many of the specifics at that point, just the general differences. I remember reading a lot of discussions, and finding Massimo and other core web2py developers actively supporting users. The smaller community and passion for web2py (having the creator write Stack Overflow responses) was very convincing. I saw web2py as a reaction to Rails and Django, taking inspiration from both, but improving on the flaws of each framework.

I don't quite remember where I found PythonAnywhere first, either in the web2py forums or this web development tutorial by Marco Laspe. However, it was definitely the tutorial that convinced me that PythonAnywhere was the way to go, as deploying web2py easily was very important to me (being so new to web development).

I searched through a couple other hosts, but PythonAnywhere was the only host with a special interest in web2py and Python.

There are many things I like about web2py and PythonAnywhere, so I'll just cover the most important points. For PythonAnywhere, the in-browser Bash Console, Scheduler, and amazing customer support are the big winners. There is an undeniably human aspect to PythonAnywhere, with very real interactions and help when you need it. I really feel that I'm working with people who care about innovation and programming, not just an anonymous corporation.

For web2py, the dedication to creating a quality product from the web2py community is my favorite thing about using web2py. I have many specific needs for my site, and so far, web2py has provided the tools to build a wide array of functionality. It always surprises me how vast web2py's tool set is, and the fact that many of the features have come from web2py community suggestions.

As for things I hate, I'm going to be very careful with what I say hahaha. Most of my frustrations come from the fact that I'm learning as I'm maintaining/creating for Stereodose; I'm doing many things for the first time. I can't say that I really hate anything about web2py or PythonAnywhere. I'm not at a level of competence to where I can accurately criticize either (sorry for the safe answer).

[Ed: if you want to check out web2py on PythonAnywhere, check out our try-web2py demo, no installation or signup required]

When did you first realise the site was getting a serious amount of hits? Where do you think they came from?

My first breakthrough was after I posted the site on /r/drugs (the drug subforum of Reddit). After that, people started periodically posting the site on other forums as well, with a fair number from Reddit too.

The biggest break was earlier this year, when the site got put on rotation with Stumbleupon, and at the same time it front-paged Reddit (through r/Music). I never spent much time marketing the site myself, and was planning on leaving it be, before this traffic surge. I liked the idea of growing Stereodose and wanted to see its full potential, so I abandoned a resume builder project I was working on to focus on Stereodose.

What kind of traffic are you seeing now?

In the most recent month:

  • Total number of users: 80k
  • Pageviews/month: 500k (website only)
  • Unique visitors/month: 70k (website only)
  • 600k songs favorited/liked by users.
  • 14k playlists played daily.
  • Anywhere from 100k - 200k songs played daily (very rough estimate, multiplying playlists by number of songs per playlist, which depends on how long the listener listens for).

What did you learn from having to scale the site so quickly?

One interesting problem was getting random records from the database efficiently -- the playlist needs to be fresh every time you load it. I originally used web2py's DAL, with orderby="<random>", to get random records from the database. Unfortunately, this would do a full table scan of the entire song data table. This is expected behavior from orderby="<random>", since it translates to MySQL's ORDER BY RAND(). As the song table grew larger, this presented more and more of a problem, contributing to many 502 errors and slow page loads.
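[Ed: for readers who haven't met this pattern, here is a minimal sketch of the slow approach. It is not Mark's code. It uses Python's built-in sqlite3 module rather than MySQL and the web2py DAL, so the function is spelled RANDOM() instead of RAND(), and the table, column, and genre names are made up for illustration.]

```python
import sqlite3


def make_song_table(n_songs=12000, n_genres=4):
    """Build an in-memory table of hypothetical songs spread across genres."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE song (id INTEGER PRIMARY KEY, genre TEXT, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO song (genre, title) VALUES (?, ?)",
        [(f"genre-{i % n_genres}", f"track-{i}") for i in range(n_songs)],
    )
    return conn


def random_songs_naive(conn, genre, limit=30):
    """Pick random songs the slow way: order every match randomly, keep a few."""
    # SQLite spells it RANDOM(); MySQL's equivalent is RAND().
    # The database has to visit and sort every candidate row before it can
    # hand back the first `limit` of them.
    return conn.execute(
        "SELECT id, title FROM song WHERE genre = ? ORDER BY RANDOM() LIMIT ?",
        (genre, limit),
    ).fetchall()


conn = make_song_table()
playlist = random_songs_naive(conn, "genre-1")
print(len(playlist), "songs selected")
```

The query returns the right thing; the cost is that the work grows with the size of the table rather than with the 30 rows actually wanted.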

Thinking about a solution, when obtaining random records I wanted to make sure that the number of records analyzed by MySQL was equal to the number of records I actually needed. If I wanted 30 songs, I didn't want to analyze 12000 records with a full table scan, but only analyze those 30 records.

The main solution I found was to generate 30 numbers randomly, and then find the record ids that matched those numbers. Frustratingly, this only works if your record ids are sequential, without any holes. Using a WHERE clause in the database query naturally creates holes in the ids of your dataset, since the results that match a WHERE clause are not sequential. I needed the WHERE clause, because the songs are chosen at random ONLY if they match the genre that the user selected.
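[Ed: a tiny illustration of the "holes" problem, using made-up data in plain Python rather than a database. The ids and genre names are hypothetical.]

```python
import random

# Hypothetical data: 12 songs with sequential ids, spread across 3 genres.
songs = {song_id: f"genre-{song_id % 3}" for song_id in range(1, 13)}

# The ids left over after filtering by genre are no longer contiguous.
genre_ids = [song_id for song_id, genre in songs.items() if genre == "genre-0"]
print(genre_ids)  # [3, 6, 9, 12]

# Guessing ids across the full id range often lands in a hole,
# i.e. on a song that belongs to a different genre.
rng = random.Random(0)
guesses = rng.sample(range(1, 13), 6)
hits = [g for g in guesses if songs[g] == "genre-0"]
print(f"{len(hits)} of {len(guesses)} guesses matched the genre")
```

Only a third of the ids belong to the chosen genre here, so random guesses over the whole range cannot be relied on to return the number of songs asked for.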

Thus, my solution was to add a column which contained a sequential number for each song, per genre (row_id_per_genre). If I wanted to select 30 songs at random from Genre A, I would find the number of records in Genre A (stored in a summary table so I don't have to count each time), create 30 unique random numbers from 1 to (# of records in Genre A), and directly find those records in the database. The row_id_per_genre column eliminates holes I would usually get from using the WHERE clause.

The resulting query can use the index on row_id_per_genre to find random records directly and avoids a full table scan, which makes database selects much faster: a drop from about 300-500ms to 15ms. Other playlist-related pages also had this problem, and I used the basic principle from this solution to speed up other random selects.
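[Ed: a rough sketch of the per-genre sequence idea described above, again using Python's built-in sqlite3 module in place of MySQL and web2py. Only the row_id_per_genre column name comes from Mark's description; the song and genre_summary tables, the song_count column, the index name, and the helper functions are assumptions made for the example.]

```python
import random
import sqlite3


def build_db(n_songs=12000, n_genres=4):
    """Create hypothetical song and summary tables with a per-genre sequence."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE song (id INTEGER PRIMARY KEY, genre TEXT, "
        "row_id_per_genre INTEGER, title TEXT)"
    )
    conn.execute(
        "CREATE TABLE genre_summary (genre TEXT PRIMARY KEY, song_count INTEGER)"
    )
    counts = {}
    rows = []
    for i in range(n_songs):
        genre = f"genre-{i % n_genres}"
        counts[genre] = counts.get(genre, 0) + 1
        # Within a genre the sequence runs 1, 2, 3, ... with no holes.
        rows.append((genre, counts[genre], f"track-{i}"))
    conn.executemany(
        "INSERT INTO song (genre, row_id_per_genre, title) VALUES (?, ?, ?)", rows
    )
    conn.executemany("INSERT INTO genre_summary VALUES (?, ?)", counts.items())
    # Composite index so each (genre, row_id_per_genre) lookup is direct.
    conn.execute(
        "CREATE UNIQUE INDEX song_genre_row ON song (genre, row_id_per_genre)"
    )
    return conn


def random_songs(conn, genre, limit=30):
    """Pick `limit` random songs from one genre without sorting the table."""
    # The count comes from the summary table, so nothing is counted per request.
    (total,) = conn.execute(
        "SELECT song_count FROM genre_summary WHERE genre = ?", (genre,)
    ).fetchone()
    # Unique random positions between 1 and the number of songs in the genre.
    picks = random.sample(range(1, total + 1), min(limit, total))
    placeholders = ", ".join("?" for _ in picks)
    rows = conn.execute(
        "SELECT id, row_id_per_genre, title FROM song "
        f"WHERE genre = ? AND row_id_per_genre IN ({placeholders})",
        (genre, *picks),
    ).fetchall()
    # Rows may come back in index order, so shuffle to randomise play order.
    random.shuffle(rows)
    return rows


conn = build_db()
playlist = random_songs(conn, "genre-2")
print(len(playlist), "songs selected")
```

One thing the sketch leaves out: the sequence and the summary count have to be kept up to date when songs are added, and deleting a song would open a hole again unless the genre is renumbered.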

I didn't realize how unscalable ORDER BY RAND() was, because I never had a big enough table with decent traffic to cause any problems.

Do your boss / college tutor / parents / children / significant other know about the site?

Most people who know me, also know that I run Stereodose. My family is somewhat iffy about the site; they come from a culture where drugs are very taboo, so it's understandable that they'd have some reservations about Stereodose.

Do you have any advice for other aspiring web developers?

It's not impossible to learn on your own, as long as you have an internet connection and can use Google search! Might seem daunting at first, but give it time and you'll be surprised at how much you can understand.

Do you have any ideas for what to do next with the site? What about other projects?

I have a couple big features coming up for the site, the main one being user-created playlists (and with that, more drugs and moods). There are a couple project ideas rattling around in my head, but I'm not ready to materialize any of them yet.

I'm also looking to see if there are other companies or projects I can contribute to once Stereodose can run on its own. Some professional experience working with other programmers would be a good change of scene (as well as some pocket change).

Final question: what do you usually pick as your playlist?

Oooo that's a tough one...I love to dance and get down with a funky vibe, so my top choice would be Weed -> Groovin. But I'm always changing it up!


Editor's note, aka the shameless marketing bit: Stereodose started out on our $12/month Web Developer plan, which we market as "able to handle hitting the front page of Hacker News now and again". As Mark's requirements grew, we started scoping out a higher-end $99 Startup Plan, although even he doesn't need quite that much! We've put together a custom package for him; if you're interested in hosting with us at a custom price point, do get in touch and tell us about your requirements. Don't be scared! We have no salespeople, just friendly devs :)

Got a PythonAnywhere story you'd like to share next month?

Drop us a line at support@pythonanywhere.com and we'll have a chat!

