Lyrics Algorithm

Let's write some lyrics!!!

A melody has been written by the crowd, one note at a time, by allowing anonymous internet users to vote on one of 8 (and later 11) pitch options.

Crowd-sourcing lyrics for this song presents some further challenges:

  • There are hundreds of potential words which can be chosen from the dictionary (rather than 11 possible choices when writing each note of the melody).
  • Lyrics could conceivably become controversial (unlike a melody). We have to do our best to prevent generated words from causing offence to anybody.
  • This project may be more prone to voting fraud / manipulation (disrupting lyrics is more exciting than disrupting a melody).

I have attempted to address each of these as follows:

Choose from the cloud of words

While it would be ideal to ask the crowd to independently think of the word which they believe would fit best, this would require a very large number of votes before a winner could be chosen (as there are thousands of possible word choices at every position in the lyrics).

A word cloud is used to aid the voter by displaying all the prior choices from the rest of the crowd. These words are arranged randomly for each user to remove any position bias. If a voter believes there is a more suitable word that is not already in the list, they may add their own new word.

To maximise knowledge input from all participants, it is important to consider that there may only be a small window of interest during a visit. Once the number of words in the cloud becomes overwhelming, we need to highlight the trending words, at the expense of introducing some bias. So, if there are more than 20 words in the cloud:

  • the most popular 5 words from all votes, and
  • the most popular 5 words from the latest 25% of the votes
will be displayed with larger text. The opportunity still remains for visitors who are interested in reading every word to hunt for outliers which they find appealing.
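
As a rough illustration (not the actual code behind this project), the highlighting rule above could be sketched in Python as follows. The function name, the constants and the choice to round the "latest 25%" up to a whole number of votes are assumptions made for the example. Ties at the cut-off are broken by whichever word was voted for first, which is also an assumption.

```python
import math
from collections import Counter

CLOUD_THRESHOLD = 20     # highlight only when the cloud has more than 20 words
TOP_N = 5                # words highlighted from each ranking
RECENT_FRACTION = 0.25   # "the latest 25% of the votes"


def highlighted_words(votes):
    """Return the set of words to display with larger text.

    `votes` is a list of voted words in chronological order (oldest first).
    """
    counts = Counter(votes)
    if len(counts) <= CLOUD_THRESHOLD:
        return set()  # small cloud: nothing is highlighted

    # Size of the "latest 25%" window; rounding up is an assumption.
    recent_size = math.ceil(len(votes) * RECENT_FRACTION)
    recent_counts = Counter(votes[-recent_size:])

    top_overall = {word for word, _ in counts.most_common(TOP_N)}
    top_recent = {word for word, _ in recent_counts.most_common(TOP_N)}

    # A word may appear in both rankings, so between 5 and 10 words result.
    return top_overall | top_recent
```

Because the two rankings are merged as a set, a word that is both an all-time favourite and a recent favourite is simply highlighted once.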

When a user adds a word, it is added in real time to everybody else's word cloud in smaller text. This helps promote awareness of new word ideas.

The most popular word is accepted as the official lyric when it satisfies ALL of the following criteria:

  • First submitted more than 12 hours ago (gives enough opportunity for other voters to flag the word in the case where it has broken a rule)
  • Most popular option of all the votes (clear favourite)
  • Most popular within the latest 25% of the votes (ensures that a word does not win just because it has been in the word cloud for the longest time)
  • At least 500 word votes in total (a sufficient sample size obtained in order to increase the likelihood of a quality decision by the crowd)
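
The four criteria above can be read as a single check. The sketch below is one possible interpretation in Python, not the project's real implementation. The `Vote` class and function names are hypothetical, the time of a word's first vote is assumed to be the time it was "first submitted", the 25% window is rounded up, and a tie for first place is assumed to mean there is no clear favourite.

```python
import math
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta

MIN_AGE = timedelta(hours=12)   # "first submitted more than 12 hours ago"
MIN_TOTAL_VOTES = 500           # "at least 500 word votes in total"
RECENT_FRACTION = 0.25          # "the latest 25% of the votes"


@dataclass
class Vote:
    word: str
    cast_at: datetime


def sole_leader(counts):
    """Return the word with strictly more votes than any other, else None."""
    ranked = counts.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie for first place: no clear favourite (assumption)
    return ranked[0][0]


def accepted_word(votes, now):
    """Return the word that satisfies ALL four criteria, or None."""
    if len(votes) < MIN_TOTAL_VOTES:
        return None
    votes = sorted(votes, key=lambda v: v.cast_at)

    # Most popular option of all the votes.
    leader = sole_leader(Counter(v.word for v in votes))
    if leader is None:
        return None

    # Must also be the most popular within the latest 25% of the votes.
    recent_size = math.ceil(len(votes) * RECENT_FRACTION)
    recent_leader = sole_leader(Counter(v.word for v in votes[-recent_size:]))
    if recent_leader != leader:
        return None

    # Must have been in the cloud for more than 12 hours.
    first_submitted = next(v.cast_at for v in votes if v.word == leader)
    if now - first_submitted <= MIN_AGE:
        return None
    return leader
```

In this reading, an early favourite that has since been overtaken in the recent window does not win, even if it still leads the overall count.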

Keep it on the rails

"One word at a time" games usually do not produce anything serious (when participants are not bound by rules). It is more entertaining to think of the craziest / funniest word to add. However, to write full song lyrics, a more stringent framework is necessary.

To produce acceptable lyrics in this (very unconventional) manner, this project locks voting down to a number of rules (relating to the resulting phrases that form) in an attempt to guide the lyrics into something meaningful. The current rules specify that a new word should not cause a phrase to become:

  • comical
  • controversial
  • political
  • profane
  • silly
  • a breach of copyright

After a vote has been cast for a word, each user will then have the ability to flag any words from other voters which break these rules.

A word that is in breach of a rule may attract a large proportion of votes from users who wish to disrupt the project. Consequently, it is important to have a facility which gives a relatively small number of voters the power to remove a word that is in breach of the rules. Equally important is to protect against "false negatives" which could potentially remove legitimate word options.

Hence, if a word is downvoted by 8 people within the latest 100 votes, that word will no longer be a valid word option and will be removed from the cloud.
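
A minimal Python sketch of this removal rule is shown below, under some stated assumptions. The `Flag` record and `is_removed` function are hypothetical names, "people" is taken to mean distinct users, and a downvote is taken to fall "within the latest 100 votes" if it was raised while any of the 100 most recent votes were being cast. The threshold is passed in rather than hard-coded, since the change log shows it has been tuned over time.

```python
from dataclasses import dataclass

WINDOW = 100  # "within the latest 100 votes"


@dataclass(frozen=True)
class Flag:
    word: str          # the word being downvoted
    user_id: str       # who raised the downvote
    votes_so_far: int  # total votes cast at the moment the flag was raised


def is_removed(word, flags, total_votes, threshold, window=WINDOW):
    """True if at least `threshold` distinct users downvoted `word`
    within the latest `window` votes."""
    cutoff = total_votes - window
    recent_flaggers = {
        f.user_id
        for f in flags
        if f.word == word and f.votes_so_far > cutoff
    }
    # Counting distinct users means one person cannot remove a word alone.
    return len(recent_flaggers) >= threshold
```

For example, `is_removed("word", flags, total_votes=250, threshold=8)` applies the threshold listed in the latest change log entry. Since a removed word is no longer a valid option, the caller would record the removal once this returns true rather than re-evaluating it later.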

If a user downvotes words that are clearly NOT in breach of the rules or if a user adds words that are clearly IN BREACH of the rules, they may be blocked from the system and / or their votes invalidated.

Word options are also locked down to a fixed underlying word bank to keep each individual word acceptable.

Prior to confirming their vote, users are provided with a preview to experience how their word would look if it were part of the lyrics.

Monitor for fraud

As with the melody writing process, I again have systems in place to manage voter fraud. A few different techniques are used such as monitoring for patterns from general location and time intervals.

With the above framework in place, the system will prevent any words from becoming official lyrics if they are in breach of a rule. This should reduce some of the incentive for abuse. However, if abuse begins to disrupt the experiment, verification steps such as a CAPTCHA or Facebook login may be introduced.

Change Log

2nd Sep 2016: Increased the size of words in cloud for top 5 words as well as top 5 of the last 25% of votes.

22nd Sep 2016: The number of downvotes required for a word to be declared invalid was previously 3 at any time. This has been modified to 5 within the latest 100 votes, in order to reduce the likelihood of false negatives.

28th Nov 2016: The number of downvotes required for a word to be declared invalid has now been modified to 8 within the latest 100 votes, in order to reduce the likelihood of false negatives.

If this algorithm needs to be adjusted throughout the project, the changes will be noted here.

Contact me if you have any questions / suggestions. Happy voting!