PSN Trophy Scraper to Scrape Games and Trophy Information

This scraper lets you pull game and trophy data from the official PlayStation website. It uses PhantomJS to scrape the data for a given user profile. The data can then be parsed with a programming language and inserted into a database.

This is still an early version of the code. It is finally producing output, so I’m going to post it now and update it with more code the next time I am able to work on it.

There is no official PSN API, so we have to take advantage of the data available in the public trophies DB on the PlayStation website. There are websites that have trophy list information for PlayStation trophies, so I wanted to find a way to do the same. The PlayStation site uses a ton of JavaScript and AJAX, meaning we can’t easily scrape the data with a plain HTTP request from PHP or any other language. To make life even harder, they seem to have a lot of safeguards in place to prevent people from getting access to this trophy data easily. I’m not sure why they have to be so secretive with this stuff!

Anyway, I was able to use PhantomJS to get past the obstacles in place and successfully dump the data from a user’s trophy list into a text file. The hardest part is done now. Once we have access to the list, it’s just a matter of gathering the data you want. It gets returned as HTML, which makes it nice and easy to parse.

If anyone has any suggestions or improvements to this please post them. As a group we may get a fully functioning scraper working!

This is my first time using PhantomJS so I’m still getting the hang of it. For anyone who doesn’t know what it is, PhantomJS is a headless browser that lets a script interact with a webpage the way a user would. You provide a URL and give commands, and PhantomJS can then click buttons and fill in forms on the page. We can return the content of the current page at any time, which allows us to pull the trophy data, or anything else for that matter.

For this example I will use Hakoom as the username since he has the most trophies of any user on PSN. If you visit the website yourself you will see that the page only loads a certain number of trophies, and there is an AJAX button at the bottom of the page to load more content. Handling that button is the next thing I will add to this document so that we can scrape a huge number of trophies at once. For now the code is able to get the first page of trophies, which I think is a good start (considering it took me ages to get working 😛 ).

In order to run this script you will need to have PhantomJS installed. This is a command line tool, but it’s not as difficult to use as it might seem. If you save the code below to a file you can run it using the following command.

phantomjs psn.js

When the script runs you should see output similar to this.


Waiting for trophy list to load...
'waitFor()' finished in 1270ms.

The window will then dump a huge load of HTML that contains the trophy data. The important part looks like this.

<h2 class="clearfix title">The Swapper</h2></div><ul class="trophies clearfix"><li class="bronze">0</li><li class="silver">0</li><li class="gold">0</li><li class="platinum">0</li></ul>

The best way to handle this data is to parse the HTML with a programming language and pull out the values we need. Many languages can do this; one of the simplest, in my opinion, is PHP. The shell_exec() function runs the above command and returns all of its output as a string, which you can then parse. (exec() on its own returns only the last line of the output.) You will need to update the path to psn.js if the PHP file is not in the same folder as the psn.js file. The call might look like this.

$trophyOutput = shell_exec("phantomjs /var/www/psnscrapper/psn.js");
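
Once the scraper output is in a variable, pulling out the game titles and trophy counts could look something like the sketch below. It is written in JavaScript for Node rather than PHP, and it assumes the markup always matches the sample fragment shown earlier (an h2 with class "clearfix title" followed by a ul with class "trophies clearfix"). The function name parseTrophies is my own invention, and regular expressions are a fragile way to read HTML, so treat this as a starting point rather than a finished parser.

```javascript
// Sketch: extract game titles and trophy counts from the HTML printed by psn.js.
// Assumption: every game appears as <h2 class="clearfix title">...</h2> followed
// by <ul class="trophies clearfix">...</ul>, as in the sample fragment.
function parseTrophies(html) {
  var games = [];
  // One match per game: group 1 is the title, group 2 is the <li> items.
  var gamePattern = /<h2 class="clearfix title">([^<]*)<\/h2>[\s\S]*?<ul class="trophies clearfix">([\s\S]*?)<\/ul>/g;
  var gameMatch;
  while ((gameMatch = gamePattern.exec(html)) !== null) {
    var counts = {};
    // Each <li> has the trophy grade as its class and the count as its text.
    var itemPattern = /<li class="(\w+)">(\d+)<\/li>/g;
    var itemMatch;
    while ((itemMatch = itemPattern.exec(gameMatch[2])) !== null) {
      counts[itemMatch[1]] = parseInt(itemMatch[2], 10);
    }
    games.push({ title: gameMatch[1].trim(), trophies: counts });
  }
  return games;
}

// The sample fragment from this post, used as test input.
var sample = '<h2 class="clearfix title">The Swapper</h2></div>' +
  '<ul class="trophies clearfix"><li class="bronze">0</li><li class="silver">0</li>' +
  '<li class="gold">0</li><li class="platinum">0</li></ul>';

console.log(JSON.stringify(parseTrophies(sample)));
```

Each entry in the returned array holds a title and an object of counts keyed by trophy grade, which maps fairly directly onto a database row per game.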

File Contents : psn.js

var page = require('webpage').create();

//open the url of the playstation trophy site.
//NOTE: the URL is missing from this listing; put the trophy page address between the quotes.
page.open('', function(status) {
  if (status !== 'success') {
    console.log("Unable to load the page.");
    phantom.exit(1);
    return;
  }

  page.evaluate(function() {
    document.getElementById("trophiesId").value = "hakoom";
    //checkPTrophies(); btn click calls this function
    //jQuery's .delay() only affects animation queues, so waitFor() below does the waiting instead.
    $('#btn_publictrophy').click();
  });

  //generally this completes in about 300-500ms.
  console.log("\nWaiting for trophy list to load...");

  waitFor(function() {
      return page.evaluate(function(){
        //this div contains all of the trophy content. Once this is present then we know that the page has successfully loaded and we are now able to pull the trophy data.
        //This is the most difficult part of using this tool. If you try calling values that arent loaded yet it can mess things up.
        //return a boolean, since DOM elements do not pass cleanly out of page.evaluate().
        return document.querySelector("#trophyTrophyList .trophy-image") !== null;
      });
  }, function(){
      setTimeout(function(){
        var trophiesDiv = page.evaluate(function(){
          //dump all of the trophy list innerHTML data.
          return document.getElementById("trophyTrophyList").innerHTML;
        });
        console.log(trophiesDiv);
        phantom.exit();
      }, 1000); // wait a little longer
  }, 20000);
});

//thanks to Artjom B for helping with this part.
function waitFor(testFx, onReady, timeOutMillis) {
    var maxtimeOutMillis = timeOutMillis ? timeOutMillis : 3000, //< Default Max Timout is 3s
        start = new Date().getTime(),
        condition = false,
        interval = setInterval(function() {
            if ( (new Date().getTime() - start < maxtimeOutMillis) && !condition ) {
                // If not time-out yet and condition not yet fulfilled
                condition = (typeof(testFx) === "string" ? eval(testFx) : testFx()); //< defensive code
            } else {
                if(!condition) {
                    // If condition still not fulfilled (timeout but condition is 'false')
                    console.log("'waitFor()' timeout");
                    clearInterval(interval); //< Stop polling so the message is not repeated
                    phantom.exit(1);
                } else {
                    // Condition fulfilled (timeout and/or condition is 'true')
                    console.log("'waitFor()' finished in " + (new Date().getTime() - start) + "ms.");
                    typeof(onReady) === "string" ? eval(onReady) : onReady(); //< Do what it's supposed to do once the condition is fulfilled
                    clearInterval(interval); //< Stop this interval
                }
            }
        }, 250); //< repeat check every 250ms
}
