7 . 12 . 15

Expanding Personas – Shared Content & Topic Extraction

Personas are a fundamental part of UXD (user experience design). Built from a combination of qualitative and quantitative data, they represent the main user groups of your website. This research helps inform web design and functionality: you’ll be designing for your target audience, which gives you a head start in meeting your business objectives.


A blog post on Moz recently caught my eye (to be fair, anything that covers automation / APIs usually does!). It contains a script which uses a number of data sources to gather commonly shared content and extract topics for a list of Twitter usernames that you supply. This can give you additional insight into your target audience:

  • What content do they consume?
  • Is there a common theme which links consumed content?
  • Where do they read it?

You’ll need some technical knowledge to get this up and running, but once you do you’ll be able to get some audience insight in a matter of minutes. I’d highly recommend reading the post on Moz before reading this. Whilst this script in no way replaces the need for thorough persona research, it can certainly assist in the content gathering phase.

You’ll notice that the process shared on Moz is for Mac only. Luckily only a few modifications are required to get this up and running on Windows. Let’s get stuck in.

  • Download the script from GitHub. Click ‘Download Zip’ and unzip the contents of the downloaded file to a folder of your choice.
  • One of the packages listed in the downloaded requirements.txt file, gnureadline, is Mac only. The Windows alternative is pyreadline. Simply overwrite requirements.txt with this one.
  • Get your Twitter API keys. Go to Twitter apps and create a new app. Enter a name, description and website (this can be a placeholder value as the script won’t be public facing). A callback URL is not required. You’ll need an API Key, API Secret, Access Token and Access Token Secret.


  • Get an AlchemyAPI key. Complete this form to get your free key. This is normally sent to your email address shortly after completion.
  • Add your Twitter and AlchemyAPI keys to a new file called within the same folder where you unzipped the files downloaded from GitHub. The structure of this file should be:
watson_api_key = "INSERT ALCHEMY API KEY HERE"
  • Download Python for Windows (direct download). This is version 2, which is recommended for this script.
  • Download Git Bash. Apart from having a funny name, this will give you a command line interface that will make you feel like you are running the original Jurassic Park (the raptor fences aren’t out, are they?).


  • Open Git Bash. We want to change the active directory (folder) to the one containing your downloaded script and associated files. The command for this is cd <directory>. Example command line: cd "S:\Persona_Tool". This is where you downloaded / placed the script. Press enter to execute the command. If you’d like to learn more about command lines, this cheat sheet is a good place to start.
  • We need to tell Git Bash where to find Python. Check the location of Python on your machine. Our example uses Python 2.7, which was found at the root of the local disk (C:\Python27). Take a note of this and insert it into the following commands (execute them one by one). The second command saves the setting so that it persists in future Git Bash sessions:
export PATH="$PATH:/c/Python27"
echo 'export PATH="$PATH:/c/Python27"' >> ~/.bashrc
  • Type and execute the following command to run the downloaded ‘’ file. This will install Pip, which is an easy way of installing Python modules and keeping them up to date.
  • The script has a number of requirements. Execute the following command to read the requirements.txt file and install them automatically.
pip install -r requirements.txt
  • We’re finally getting there! Create a usernames.txt file in the folder with a list of Twitter usernames that you wish to run the analysis for. You may have your own method of collecting relevant usernames but Followerwonk (as suggested by Moz) is a good tool for finding relevant profiles by a bio search. You’ll need one username per row, example below.
  • Start the script by running the following command. You should see a sequence of loading events and it will inform you when the script has finished running. It will create two time stamped CSV files in the script folder. One will be a list of domains with share count and the other will be a list of topics extracted from shared content.
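
Before starting a run it can be worth sanity-checking usernames.txt. The sketch below is not part of the Moz script, and the function name clean_usernames is hypothetical. It assumes the one-username-per-row format described above and Twitter's username rules (1 to 15 characters; letters, digits and underscores only), strips a leading @ and drops blank rows, invalid rows and duplicates.

```python
import re

# Twitter usernames: 1-15 characters, letters, digits and underscores only.
USERNAME_RE = re.compile(r"^[A-Za-z0-9_]{1,15}$")


def clean_usernames(lines):
    """Return valid, de-duplicated usernames in their original order.

    Hypothetical helper for illustration; not part of the downloaded script.
    """
    seen = set()
    cleaned = []
    for line in lines:
        name = line.strip().lstrip("@")
        if not USERNAME_RE.match(name):
            continue  # skip blank rows and anything that is not a valid handle
        key = name.lower()  # handles are case-insensitive, so compare in lower case
        if key not in seen:
            seen.add(key)
            cleaned.append(name)
    return cleaned
```

Passing it an open file works because a file yields one line at a time, for example clean_usernames(open("usernames.txt")).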



It appears that space missions, politics and business have been the hot topics amongst this list of users recently!
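
If you would rather rank the most-shared domains without opening a spreadsheet, something like the sketch below could work. It is an illustration rather than part of the Moz script: it assumes the domains CSV has rows whose first column is the domain and whose second column is the share count (with an optional header row), so check that against your own output file first. The function names top_domains and top_domains_from_file are hypothetical.

```python
import csv


def top_domains(rows, limit=10):
    """Return the `limit` most-shared (domain, count) pairs, highest first.

    `rows` is any iterable of CSV rows (lists of strings).
    Assumption: column 0 is the domain and column 1 is the share count.
    """
    totals = {}
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        try:
            count = int(row[1])
        except ValueError:
            continue  # skip a header row or any non-numeric count
        domain = row[0].strip().lower()
        totals[domain] = totals.get(domain, 0) + count
    # Sort by count (descending), then by domain name for a stable order.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


def top_domains_from_file(path, limit=10):
    """Read a CSV file produced by the script and rank its domains."""
    with open(path) as handle:
        return top_domains(csv.reader(handle), limit)
```

Call top_domains_from_file with the name of the timestamped domains file that the script created in your folder.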

Summary & next steps

You’ll now have persona data to build upon. The hard work has now been done and you’ll be able to run reports again for different usernames just by following the final two steps.

If you followed the process your coding ninja abilities will have increased too.

A web-based tool (internal or public) would be more user-friendly and is certainly possible from a technical point of view. Combining this data with other APIs, data sources and visualisation methods to make a more fully featured persona tool is an idea I’ll be pursuing.

Image credit (header): Daniel Burns


Matt is an Analytics Solutions Architect at twentysix with particular expertise within Google Tag Manager, CRO and all things technical such as HTML, CSS, JavaScript, Dojo, jQuery, PHP, Firebase, Angular & Ionic. He also enjoys building web / mobile apps outside of work.
