Imagine peeking at your neighbour while he is making love to his wife, and videotaping it at the same time. Then you offer to sell the video to another neighbour, or you upload it to YouTube. That is roughly what parsing looks like.

Except that you are peeking at the data on a site. And in the case of social networks, at personal data and the network’s “intimate contact” with its users.

If services, sites, social networks or programs want to share data, they create an API and open access to it. This is an interface through which programs can interact with each other. For example, airline ticket aggregators collect data from the sites that sell these tickets, by mutual agreement and to mutual benefit.

Social networks also open up some data under their terms of use. But those terms often forbid collecting data for any purpose, whether for personal use or to analyze and publish the results. Parsing services are nevertheless used to analyze accounts on Instagram.

The most ardent fighter against parsers is LinkedIn. It sued anonymous parsers, accusing them of fraud, abuse, violating the criminal code and copyright law, trespassing, and even theft.

Parsers also have indirect victims, apart from the platforms themselves. For example, bloggers or accounts whose subscribers you want to parse.

The network spends money to attract users, to maintain capacity, to pay employees. A blogger spends money on converting users into subscribers. When you “steal” a subscriber database, you are taking not just a set of data but the money that was spent on every person in that database.

Parsing can violate copyright and related rights. You can’t collect data that is a trade secret. You can’t use the collected data to limit competition. You must not interfere with the work of the site or push its load to a critical level. You must not extract personal information or violate the site’s terms of use.

What can you do, then? You can collect data from your own website for further use, under a personal data agreement.

For example, when you want to filter your audience and export only those who recently got married, celebrate their birthday this month, or published posts with keywords …, and then use that list for retargeting. But there’s a Facebook pixel for that.

How do you check if the site allows parsing?

Enter url/robots.txt in your browser’s address bar. You will see a file that lists what automated scanning and indexing of the site is permitted and what is forbidden. You can collect data from the site if you get written permission from the site owners.
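The same check can be done in code rather than by eye. Below is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the `my-bot` user agent, the `example.com` URLs and the `is_allowed` helper are all made up for illustration, so that the example runs without touching a real site. Keep in mind that it only tells you what robots.txt says; it says nothing about the site's terms of use.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "my-bot" and the example.com URLs are placeholders, not real endpoints.
print(is_allowed(SAMPLE_ROBOTS_TXT, "my-bot", "https://example.com/blog/post"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "my-bot", "https://example.com/private/data"))
```

To check a live site instead of an inlined string, the same parser can load the file itself: call `set_url()` with the address of the site's robots.txt, then `read()`, and then `can_fetch()` as above.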

To some extent, monitoring the media and social networks for mentions of your name or brand name is also parsing. But it is best to collect such data through monitoring services that have received all the necessary permissions from the resources where the information was published.
