Researchers develop new software to fix broken links

15 Feb 2014

Thanks to researchers, the frustration we all experience when clicking a broken link and landing on an error page may become a thing of the past, PTI reports.

However, a more frustrating problem, with wider implications for science, healthcare, industry and other areas, arose when machines communicated and expected to find specific resources that turned out to be missing or dislocated from their identifiers, the report said.

According to the researchers, this could cause problems when a computer was processing large amounts of data in a financial or scientific analysis, for instance. If the resource continued to exist on servers, it should be retrievable, given a sufficiently effective algorithm that could recreate the missing links.

According to computing engineers Mohammad Pourzaferani and Mohammad Ali Nematbakhsh of the University of Isfahan, previous efforts to address the issue of broken links had focused on the destination point.

That approach had two inherent limitations: first, it focused on a single point of failure, whereas there might be wider issues across a database; second, it relied on knowledge of the destination data source.

Pourzaferani said the proposed algorithm used the fact that entities preserved their structure even after moving to another location. The algorithm, therefore, created an exclusive graph structure for each entity, he added.
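
As a rough illustration of what a per-entity graph structure could look like, the Python sketch below groups the links around each entity in a small set of subject-predicate-object triples. It is not the researchers' implementation: the function name build_entity_graphs, the "in"/"out" layout and the sample triples are all assumptions made for this example.

```python
from collections import defaultdict


def build_entity_graphs(triples):
    """Map each entity to its local structure (hypothetical layout).

    "out" holds (predicate, target) pairs for links leaving the entity,
    "in" holds (predicate, source) pairs for links pointing at it.
    """
    graphs = defaultdict(lambda: {"in": set(), "out": set()})
    for subject, predicate, obj in triples:
        graphs[subject]["out"].add((predicate, obj))
        graphs[obj]["in"].add((predicate, subject))
    return dict(graphs)


# Made-up sample data, for illustration only.
triples = [
    ("ex:Alice", "ex:bornIn", "ex:Isfahan"),
    ("ex:Alice", "ex:worksAt", "ex:University"),
    ("ex:Bob", "ex:knows", "ex:Alice"),
]
graphs = build_entity_graphs(triples)
```

Because this structure depends only on an entity's neighbours and not on its own address, it can in principle be compared with structures found elsewhere after the entity has moved.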

When a broken link was detected, the algorithm set out to find the new location of the detached entity or, failing that, the most similar candidate for it, he pointed out.

To that end, the crawler controller module searched for the superiors of each entity in the inferior dataset, and vice versa. After several steps, the search space was narrowed and the best candidate was chosen, he added.
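
The narrowing and selection steps can be sketched in Python under some loud assumptions: each entity's structure is reduced to a set of (predicate, neighbour) pairs, similarity is measured with a Jaccard score, and a fixed threshold decides whether a candidate is good enough. None of these choices is confirmed by the report, and the names find_best_candidate and threshold are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_best_candidate(old_structure, new_snapshot, threshold=0.5):
    """Return (uri, score) of the most similar entity, or (None, score).

    old_structure: set of (predicate, neighbour) pairs of the detached entity.
    new_snapshot:  dict mapping each entity URI to such a set.
    """
    # Narrow the search space: keep only entities sharing at least one link.
    candidates = {uri: s for uri, s in new_snapshot.items() if s & old_structure}
    best_uri, best_score = None, 0.0
    for uri, structure in candidates.items():
        score = jaccard(old_structure, structure)
        if score > best_score:
            best_uri, best_score = uri, score
    if best_score >= threshold:
        return best_uri, best_score
    return None, best_score


# Made-up sample data, for illustration only.
old = {("bornIn", "Isfahan"), ("worksAt", "University"), ("knows", "Bob")}
new_snapshot = {
    "ex:Alice_Smith": {("bornIn", "Isfahan"), ("worksAt", "University"), ("knows", "Carol")},
    "ex:Dave": {("bornIn", "Isfahan")},
    "ex:Eve": {("bornIn", "Tehran")},
}
match = find_best_candidate(old, new_snapshot)
```

In this toy data, ex:Eve shares no link with the detached entity and is discarded during narrowing, while ex:Alice_Smith shares two of four distinct links and is returned as the match.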

The algorithm was tested on two snapshots of DBpedia containing almost 300,000 person entities.

The researchers' algorithm identified almost 5,000 entities that had changed between the first snapshot and the second, which was recorded some time later.

The algorithm was able to relocate 9 out of 10 of the broken links.

The International Journal of Web Engineering and Technology carries the details in a report.