Hamburger Music

Hamburger Music is a generative poetry project that began in the summer of 2014.

It’s powered by a Python script that scrapes words and phrases from YouTube closed captions. In turn, the snippets of prose created by this script are used to power my personal Twitter account and a poetry blog of the same name.

It was written by Elon Bing, a talented young developer from the Netherlands.

Some Examples

“I am God…
little bit to deal with there.
Those lost memories…
you’re actually at.
Now I am just holding on to it”

“I’ve given you 100% already
I secretly love you 100%
I’ve given you 100% already
Harmful.”

How it Works

When run, the script performs the following:

  1. Selects a random YouTube video
  2. Scrapes one or more lines from the selected video's closed captions
  3. Places those lines on top of each other
  4. Outputs the resulting combination (or single caption) as a .txt document
  5. Repeats the steps above for each additional poem, if more than one is requested (see NUMBER_OF_POEMS)

That’s it. After you run it, you'll end up with a folder full of .txt documents with which you can do whatever you like. It’s a great way to generate lots of random snippets of writing.

Note that the closed captions fetched from YouTube are only those that were entered manually (getting automated captions is kind of complicated).
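As a rough illustration, the five steps above could be sketched like this. This is not the actual script: build_poem and its caption_sources argument are hypothetical names, and the network side (picking a video and downloading its captions) is replaced by a plain list of caption lists. The sketch is written for Python 3.

```python
import random


def build_poem(caption_sources, poem_length=5, max_lines_per_video=2, rng=random):
    """Assemble one poem from already-fetched captions.

    caption_sources is a list of lists: one inner list of caption
    strings per video. It stands in for the real scraping step.
    """
    if max_lines_per_video < 1:
        raise ValueError("max_lines_per_video must be at least 1")
    # Ignore videos that yielded no captions at all.
    sources = [captions for captions in caption_sources if captions]
    if not sources:
        raise ValueError("no captions to build a poem from")

    lines = []
    while len(lines) < poem_length:
        captions = rng.choice(sources)              # 1. pick a random "video"
        remaining = poem_length - len(lines)
        take = min(max_lines_per_video, remaining, len(captions))
        lines.extend(rng.sample(captions, take))    # 2. grab a few of its lines
    return "\n".join(lines)                         # 3. stack them


if __name__ == "__main__":
    # Made-up grouping of captions into two "videos", for demonstration only.
    demo_sources = [
        ["But then again", "THIS MORNING."],
        ["Are more likely to receive clicks.", "Is showing to potential customers."],
    ]
    print(build_poem(demo_sources))                 # 4. output the result
```

Calling build_poem in a loop would cover step 5. Passing a seeded random.Random instance as rng makes the output repeatable, which is handy for testing.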

Structure & Variables

"Settings.py" lays out the variables and filters we can use to control the output of this script:

POEM_LENGTH=5
MINSUBLENGTH=3
MAX_LINES_PER_VIDEO=2
SAVE_TO_FILE=True
POEM_BASENAME="Poem-"
POEM_EXTENSION=".txt"
POEM_PATH="/path/to/poetry/dir/"
NUMBER_OF_POEMS=50
WORDRANGE=xrange(3,6)
#WORDRANGE=[1,3,7,8]
#WORDRANGE=[6]
CAPITALIZE=True

POEM_LENGTH
This controls the number of lines in your poem, and therefore how many captions are scraped. For example, setting it to 5 might get you something like:

No? ok… *crying intensifies*
But then again
Are more likely to receive clicks.
Is showing to potential customers.
THIS MORNING.

MINSUBLENGTH

The minimum length, in characters, of a caption. This was included to keep single letters and two-letter words from becoming entire lines (though if you're into that kind of thing, just comment it out!).
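A minimal sketch of what such a length check might look like. The function name long_enough is made up for this example, and stripping surrounding whitespace before measuring is an assumption, not something the script is known to do.

```python
MINSUBLENGTH = 3


def long_enough(caption, min_length=MINSUBLENGTH):
    """Return True if the caption has at least min_length characters.

    Leading and trailing whitespace is ignored (an assumption).
    """
    return len(caption.strip()) >= min_length
```

With the default of 3, a caption like "a" or " ok " would be skipped, while "But then again" would be kept.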

MAX_LINES_PER_VIDEO
The maximum number of lines scraped from a single video. A higher number means fewer videos have to be fetched per poem, so the script runs faster, but it cuts down on the output’s “randomness”.

SAVE_TO_FILE
When set to True, the script saves each poem as a .txt document. Setting it to False prints the result directly in the terminal instead.

NUMBER_OF_POEMS
The number of poems to generate in a single run. After the script finishes one poem, it starts over until this many poems have been created.

WORDRANGE
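There are a few different styles of this variable, all of which appear in the settings above:

  1. A range of word counts, e.g. xrange(3,6), which covers 3, 4, and 5 words (the upper bound is excluded)
  2. A list of specific word counts, e.g. [1,3,7,8]
  3. A single word count, e.g. [6]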

Note that WORDRANGE will not truncate or modify captions. It simply returns captions that meet those criteria. Also note that only one WORDRANGE definition can be active at a time; leave the others commented out.
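To make the filtering idea concrete, here is a small sketch. It assumes that WORDRANGE describes how many whitespace-separated words a caption may contain; in_word_range is a made-up name. It uses range, which behaves like Python 2's xrange for this purpose and also works in Python 3.

```python
WORDRANGE = range(3, 6)  # 3, 4, or 5 words; the upper bound is excluded


def in_word_range(caption, word_range=WORDRANGE):
    """Return True if the caption's word count is one of the allowed values.

    The caption itself is never truncated or changed, only accepted or rejected.
    """
    return len(caption.split()) in word_range
```

The same function handles the list styles too, for example in_word_range(caption, [1, 3, 7, 8]) or in_word_range(caption, [6]).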

CAPITALIZE
Setting this to True will capitalize the first letter of each line. Setting it to False will leave each caption's original formatting untouched (which may or may not be capitalized).
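A possible way to capitalize only the first letter, shown as a sketch (capitalize_first is a hypothetical name). It avoids Python's built-in str.capitalize(), because that method also lowercases the rest of the string, which would flatten captions written in all caps.

```python
def capitalize_first(line):
    """Uppercase the first character and leave everything else as it was."""
    return line[:1].upper() + line[1:]
```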

Filtering Content

Also included is a feature that allows you to whitelist or blacklist certain words or phrases:

WHITELIST=[
'awesome',
'I\'m hungry',
'driveway'
]

#WHITELIST=open('/path/to/whitelist.txt').read().splitlines()

BLACKLIST=[
'New York',
'Los Angeles',
'don\'t mess with me'
]

#BLACKLIST=open('/path/to/blacklist.txt').read().splitlines()

New additions to the whitelist or blacklist should follow the convention seen above. Don't forget to wrap each new word or phrase in 'single quotes', separate entries with commas, and escape any apostrophes with a backslash (as in 'I\'m hungry').

To use an external whitelist or blacklist, create a text document with one word or phrase per line, uncomment the corresponding open(...) line, and change /path/to/ to point to that file. This can make long lists easier to manage.

If you want to disable these lists, you can comment them out or leave them blank.
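How the two lists are applied is not spelled out here, so the following is only one plausible reading, sketched with a made-up function name (passes_filters). It assumes that matching is a case-insensitive substring test, that a caption must contain at least one whitelist entry when a whitelist is given, and that it must contain no blacklist entries.

```python
def passes_filters(caption, whitelist=None, blacklist=None):
    """Return True if the caption survives both lists.

    An empty or missing list disables that filter.
    """
    text = caption.lower()
    if blacklist and any(term.lower() in text for term in blacklist):
        return False
    if whitelist and not any(term.lower() in text for term in whitelist):
        return False
    return True
```

Under these assumptions, passes_filters("Welcome to New York", blacklist=['New York']) would reject the caption, while leaving both lists blank lets everything through.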

Further Reading

During this project, I stumbled across a poet by the name of Kenneth Goldsmith. He's an advocate of Plagiarism as Content, a hopelessly new-media-centric term describing the kind of algorithmic, data-fueled poetry that has been making its rounds during the early part of the Information Era. I encourage anyone who is interested in this kind of work to look him up.

He gave a talk a while back entitled "On Uncreative Writing".

I was also heavy into Dada stuff when I made this.