Google wants to rank internet pages based on the quality of the facts they contain.
A new paper published by a group of software engineers suggests that the internet search giant may be preparing to change the algorithms it uses to scour the web.
Currently, search results are ranked according to a complex combination of keywords and links from other websites, but this fails to weed out inaccurate information.
Instead a Google research team has developed a way of measuring the trustworthiness of the information contained on an internet page.
If implemented it could mean that currently popular sources that regularly get facts wrong could fall foul of the new search technique.
GOOGLE PAY IS ON ITS WAY
Not one to be outdone by its rivals, Google is reportedly working on a mobile payment service called Android Pay.
The firm is expected to officially announce the service at its developer conference later this year.
It will create a way for companies to accept transactions through their apps without having to introduce their own individual payment services.
Android Pay users will then be able to upload credit card or debit card information to a single secure location but use it to pay for items across apps.
And customers will be able to use it to pay for in-app purchases, goods or services with a single tap.
Google is also expected to allow companies to use its Android Pay API to enable tap-to-pay options in physical stores using NFC readers, for example.
In a paper to be published in the Proceedings of the VLDB (Very Large Data Bases) Endowment, the Google researchers said webpages would be allocated trustworthiness scores.
They said: 'Quality assessment for web sources is of tremendous importance in web search.
'It has been traditionally evaluated using exogenous signals such as hyperlinks and browsing history.
'For example, the gossip websites listed (see image below) mostly have high PageRank scores, but would not generally be considered reliable.
'Conversely, some less popular websites nevertheless have very accurate information.'
'We address the fundamental question of estimating how trustworthy a given web source is.
'Informally, we define the trustworthiness or accuracy of a web source as the probability that it contains the correct value for a fact (such as Barack Obama's nationality), assuming that it mentions any value for that fact.'
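The informal definition quoted above can be illustrated with a minimal sketch. It assumes a hypothetical dictionary of reference values and a hypothetical source; the function name source_accuracy and the sample data are illustrative and are not taken from the paper. A simple fraction stands in here for the probability the researchers describe.

```python
from typing import Dict, Optional, Tuple

# A fact is identified by (subject, attribute), e.g. ("Barack Obama", "nationality").
Fact = Tuple[str, str]


def source_accuracy(claims: Dict[Fact, str], reference: Dict[Fact, str]) -> Optional[float]:
    """Fraction of a source's claims that match the reference values.

    Only facts the source actually mentions (and that the reference covers)
    are counted, mirroring the "assuming that it mentions any value" clause.
    """
    checked = [fact for fact in claims if fact in reference]
    if not checked:
        return None  # nothing to judge the source on
    correct = sum(1 for fact in checked if claims[fact] == reference[fact])
    return correct / len(checked)


# Hypothetical reference values and source claims, for illustration only.
reference = {
    ("Barack Obama", "nationality"): "USA",
    ("Catherine Zeta-Jones", "birthplace"): "Wales",
}
claims = {
    ("Barack Obama", "nationality"): "USA",
    ("Catherine Zeta-Jones", "birthplace"): "New Zealand",
}

accuracy = source_accuracy(claims, reference)
print(accuracy)  # 0.5
```

A source that says nothing about a fact is neither rewarded nor penalised for it in this sketch, which is why the function returns None when there is nothing to check.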
Currently web searches are ranked by, among other things, the number of incoming links to a page, which Google's search bots use to help determine the quality of that page.
This, however, is really only a measure of the popularity of a webpage rather than the accuracy of the information it contains.
Webpages containing inaccurate information can be widely shared and linked to by blogs and other external sites, causing them to feature high up in Google search results.
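A toy example shows why link counts measure popularity rather than accuracy. The link graph and site names below are hypothetical, and plain in-link counting is used as a simplified stand-in; it is not PageRank, and Google's real ranking combines many more signals.

```python
from typing import Dict, List

# Hypothetical link graph: each page maps to the pages it links out to.
links: Dict[str, List[str]] = {
    "blog-a.example": ["gossip.example"],
    "blog-b.example": ["gossip.example"],
    "blog-c.example": ["gossip.example", "reference.example"],
    "gossip.example": [],
    "reference.example": [],
}


def incoming_link_counts(graph: Dict[str, List[str]]) -> Dict[str, int]:
    """Count how many distinct pages link to each page (self-links ignored)."""
    counts = {page: 0 for page in graph}
    for source, targets in graph.items():
        for target in set(targets):  # each linking page counts once
            if target != source:
                counts[target] = counts.get(target, 0) + 1
    return counts


counts = incoming_link_counts(links)
ranked = sorted(counts, key=lambda page: counts[page], reverse=True)
print(ranked[0], counts[ranked[0]])  # gossip.example 3
```

Nothing in this calculation looks at what the pages say, so a widely linked page with wrong facts still comes out on top.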
However, Xin Luna Dong, Wei Zhang and colleagues at Google have designed a way of automatically extracting information from webpages and ranking them for trustworthiness.
They use a system they have called Knowledge-Based Trust, which pulls facts from many pages and then jointly estimates the correctness of those facts and the accuracy of the sources they came from.
It then counts the number of incorrect facts on a page to give it a trust score.
To do this, the software the team has developed draws on Google's vast Knowledge Vault - a store of facts that have been pulled off the internet and are unanimously agreed on as being true.
The software extracts what are known as 'knowledge triplets' - such as Barack Obama, nationality, USA.
Websites that contain contradictory information are then moved down the rankings on internet searches.
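The steps described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the researchers' method: the paper uses a probabilistic model, whereas this sketch simply counts mismatches against a small hypothetical fact store (KNOWN_FACTS) and assumes the triplets were extracted without error. The page names, the Triple type and the trust_score function are all illustrative.

```python
from typing import Dict, List, NamedTuple, Tuple


class Triple(NamedTuple):
    """A knowledge triplet such as (Barack Obama, nationality, USA)."""
    subject: str
    predicate: str
    obj: str


# Hypothetical stand-in for a store of agreed facts: (subject, predicate) -> value.
KNOWN_FACTS: Dict[Tuple[str, str], str] = {
    ("Barack Obama", "nationality"): "USA",
    ("Catherine Zeta-Jones", "birthplace"): "Wales",
}


def trust_score(triples: List[Triple]) -> float:
    """Return 1 minus the share of checkable triplets that contradict the store."""
    checked = 0
    incorrect = 0
    for triple in triples:
        known = KNOWN_FACTS.get((triple.subject, triple.predicate))
        if known is None:
            continue  # the store has no value for this fact, so it cannot be checked
        checked += 1
        if triple.obj != known:
            incorrect += 1
    if checked == 0:
        return 0.5  # assumption: neutral score when nothing can be checked
    return 1 - incorrect / checked


# Hypothetical pages and the triplets extracted from them.
pages = {
    "page-two.example": [
        Triple("Barack Obama", "nationality", "USA"),
        Triple("Catherine Zeta-Jones", "birthplace", "New Zealand"),
    ],
    "page-one.example": [
        Triple("Barack Obama", "nationality", "USA"),
        Triple("Catherine Zeta-Jones", "birthplace", "Wales"),
    ],
}

scores = {url: trust_score(triples) for url, triples in pages.items()}
ranking = sorted(scores, key=lambda url: scores[url], reverse=True)
print(ranking)  # ['page-one.example', 'page-two.example']
```

The page with the contradictory birthplace drops below the page whose facts all match the store.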
In a test of their system, the researchers applied their model to 2.8 billion pages on the internet and were able to reliably predict the trustworthiness of 119 million pages and 5.6 million websites.
Giving an example of the nationality of US President Barack Obama, they say their system can distinguish between sources that are correct, those that are wrong and those read incorrectly by information extraction software.
When the team applied their algorithm to a list of top 15 gossip websites including Yahoo! OMG!, TMZ, E Online, and Gawker, they found that the sites ranked in the bottom 50 per cent of websites for trustworthiness.
The researchers said: 'In other words, they are considered less trustworthy than half of the websites.'
However, under the current search system used by Google, they rank among the top 15 per cent.
In another example, the paper describes how forums are another source of poor information.
They said: 'For instance, we discovered that answers.yahoo.com says that "Catherine Zeta-Jones is from New Zealand", although she was born in Wales according to Wikipedia.'
A spokesman for Google said there were no specific plans to use the algorithm in its public search engine yet.
He said: 'This was research - we don’t have any specific plans to implement it in our products. We publish hundreds of research papers every year.'