Scraping the timeline of a Facebook page

Have you ever wondered how to get all the timeline posts for a given Facebook page? And by all, I mean all of them, back to the very first post created on the page. That includes all the comments and the replies to those comments. It also includes all the likes on the posts, comments and replies: not just the number of likes, but also the users who made them. If you are an admin of the page, you can access all the posts, even the ones that are hidden, gated or restricted. If you are not an admin, you can still access all the public ones.

To work with the Facebook Graph API you need an application. With the application you can get a user access token or, just for trying things out, you can take the App ID and App Secret from the app dashboard on the Facebook developers site at:
https://developers.facebook.com/apps//dashboard/
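As far as I know, the Graph API accepts the App ID and App Secret joined with a pipe character as an app access token. The helper below is a minimal sketch of that; the function name and the sample values are made up for illustration, and the secret should only ever be used in server-side code.

```python
def app_access_token(app_id: str, app_secret: str) -> str:
    """Join the App ID and App Secret into the 'id|secret' form.

    Assumption: the Graph API treats this string as an app access token.
    """
    return f"{app_id}|{app_secret}"


# Hypothetical values, not real credentials.
token = app_access_token("1234567890", "0123456789abcdef")
```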

The Facebook Graph API is a mesh of objects with links between them. To access the timeline posts for a page we start from the page itself. Let's take https://www.facebook.com/Komfo.bg/ as an example: the ID of this page is 194885970934, or we could also use its username, komfo.bg.

To get the stream of all page and user posts we first use the /feed endpoint on the page object. For example:
https://graph.facebook.com/komfo.bg/feed?access_token=...

will return something like:
{
  "data": [
    {
      "message": "Software University проведе двудневна олимпиада по състезателно програмиране, уеб и софтуерни проекти на 9 и 10 януари. В събитието взеха участие около 400 ученици и студенти от страната. Част от журито, което оценяваше уеб и софтуерните проекти на състезатели до 16 годишна възраст, бе Михаил Миков, един от разработчиците в Komfo.",
      "story": "Komfo added 25 new photos to the album: СофтУниада 2016.",
      "created_time": "2016-01-13T09:15:29+0000",
      "id": "194885970934_10153687714240935"
    },
    {
      "message": "Колко бързат хората сутрин, за да дойдат на работа в Komfo навреме... заради закуската :)",
      "story": "Komfo with Damian Bogdanov and 3 others.",
      "created_time": "2016-01-08T10:12:56+0000",
      "id": "194885970934_10153677620250935"
    }
  ]
}
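A call like the one above could be made from Python roughly as follows. This is only a sketch using the standard library; the helper names feed_url, fetch_json and post_ids are my own, and the token is a placeholder you have to supply.

```python
import json
import urllib.parse
import urllib.request

GRAPH_URL = "https://graph.facebook.com"


def feed_url(page: str, access_token: str) -> str:
    """Build the /feed URL for a page ID or username."""
    query = urllib.parse.urlencode({"access_token": access_token})
    return f"{GRAPH_URL}/{page}/feed?{query}"


def fetch_json(url: str) -> dict:
    """GET a URL and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def post_ids(feed: dict) -> list:
    """Collect the post IDs from the 'data' list of a feed response."""
    return [post["id"] for post in feed.get("data", [])]


# Usage, with a real token in place of the placeholder:
#   feed = fetch_json(feed_url("komfo.bg", "YOUR_TOKEN"))
#   print(post_ids(feed))
```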


But these are only the top-level posts. There is no information about the number of likes, shares and comments on those posts, and the actual likes, shares and comments have to be fetched in separate calls. For the example above, to get the comments on the second post, 194885970934_10153677620250935:

https://graph.facebook.com/194885970934_10153677620250935/comments?access_token=...

{
   "data": [
      {
         "id": "10153677620250935_10153677733565935",
         "from": {
            "name": "Simona Dakova",
            "id": "10209043790078486"
         },
         "message": "..",
         "can_remove": false,
         "created_time": "2016-01-08T11:56:59+0000",
         "like_count": 4,
         "user_likes": false
      },
      {
         "id": "10153677620250935_10153677819570935",
         "from": {
            "name": "Lyubomir Stoyanov",
            "id": "10206989449169190"
         },
         "message": "....",
         "can_remove": false,
         "created_time": "2016-01-08T12:58:27+0000",
         "like_count": 3,
         "user_likes": false
      }
   ]
}
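To illustrate why this gets expensive, here is a small sketch of the one-call-per-post approach: one URL has to be built and requested for every post ID. The helper names are hypothetical, and the field names are taken from the sample response above.

```python
import urllib.parse


def comments_url(post_id: str, access_token: str) -> str:
    """Build the /comments URL for a single post."""
    query = urllib.parse.urlencode({"access_token": access_token})
    return f"https://graph.facebook.com/{post_id}/comments?{query}"


def commenters(comments: dict) -> list:
    """Return (author name, like count) pairs from a comments response."""
    return [
        (comment["from"]["name"], comment["like_count"])
        for comment in comments.get("data", [])
    ]


# One request per post: N posts mean N extra calls just for the comments.
post_ids = ["194885970934_10153687714240935", "194885970934_10153677620250935"]
urls = [comments_url(post_id, "YOUR_TOKEN") for post_id in post_ids]
```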

But the Graph API allows nesting multiple calls from a single node using the fields parameter. The general form is:
GET graph.facebook.com
  /{node-id}?
    fields=<first-level>{<second-level>}
For example, here comments is the first level:
https://graph.facebook.com/komfo.bg/feed?fields=comments

will return the posts with the comments embedded into every post, in a single call! To also access the likes and shares, it is as easy as:
https://graph.facebook.com/komfo.bg/feed?fields=comments,likes,shares

Now the issue is that the comments have their own comments and likes. But the fields parameter supports even deeper nesting. For example, to get the posts, their comments, and the likes on those comments:
https://graph.facebook.com/komfo.bg/feed?fields=comments{likes},likes,shares

Now to get the replies on the comments, and the likes on those replies:
https://graph.facebook.com/komfo.bg/feed?fields=comments{likes,comments{likes}},likes,shares

In this example
  • the first comments is the first level
  • {likes,comments{likes}} is the second level
    • likes,comments gives access to the likes and the replies on the first-level comments
      • {likes} is the third level, and it gives access to the likes on the second-level comments (the replies).
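Writing these nested strings by hand gets error-prone as the levels pile up. One way to keep the braces balanced is to describe the levels as a nested dictionary and generate the fields string from it. This is just a sketch; build_fields is a name I made up, not part of any Facebook SDK.

```python
def build_fields(spec: dict) -> str:
    """Turn a nested dict into a Graph API fields string.

    Keys are field names; a value is either None (plain field) or
    another dict describing the next nesting level.
    """
    parts = []
    for name, children in spec.items():
        if children:
            parts.append(f"{name}{{{build_fields(children)}}}")
        else:
            parts.append(name)
    return ",".join(parts)


# The three levels from the example above.
spec = {
    "comments": {                      # first level
        "likes": None,                 # second level
        "comments": {"likes": None},   # second level, with third-level likes
    },
    "likes": None,
    "shares": None,
}
fields = build_fields(spec)
```

Here fields ends up as the same string used in the last URL: comments{likes,comments{likes}},likes,shares.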

A final, complete example that fetches

  • all posts with their comments, likes and shares
  • all their second-level comments and likes
  • all their third-level likes

komfo.bg/feed?fields=message,story,description,created_time,from,
    likes.summary(true).limit(100).order(reverse_chronological){name},
    comments.summary(true).order(reverse_chronological).limit(100)
        {
            from,message,
            likes.summary(true).limit(100)
                        .filter(stream).order(reverse_chronological){name},
            comments.summary(true).order(reverse_chronological).limit(100)
            {
                from,message,
                likes.summary(true).limit(100)
                        .filter(stream).order(reverse_chronological){name}
            }
        },
    shares


With these limits you can get up to 100 items on every level. To get the whole list you then need to page through the results.
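A rough sketch of the paging loop, assuming each response carries a paging object whose next key holds the URL of the following page (and that the key is missing on the last page). The names iter_edge and fetch_json are mine. The fetch function is passed in as a parameter so the loop can be tried without network access.

```python
import json
import urllib.request


def fetch_json(url: str) -> dict:
    """GET a URL and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def iter_edge(first_page: dict, fetch=fetch_json):
    """Yield every item of an edge, following paging.next until it runs out."""
    page = first_page
    while True:
        yield from page.get("data", [])
        next_url = page.get("paging", {}).get("next")
        if not next_url:
            break
        page = fetch(next_url)


# Usage, with a feed response already loaded into `feed`:
#   for post in iter_edge(feed):
#       print(post["id"])
```

The embedded comments and likes lists should be pageable in the same way, so the same generator could be applied to post["comments"] or post["likes"] wherever those keys are present.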
