Detecting fake news, at its source

04 Oct 2018

Lately the fact-checking world has been in a bit of a crisis. Sites like Politifact and Snopes have traditionally focused on specific claims, which is admirable but tedious — by the time they've gotten through verifying or debunking a fact, there's a good chance it's already travelled across the globe and back again.

Social media companies have also had mixed results limiting the spread of propaganda and misinformation — Facebook plans to have 20,000 human moderators by the end of the year, and is spending many millions developing its own fake-news-detecting algorithms.

Researchers from MIT's Computer Science and Artificial Intelligence Lab (CSAIL) and the Qatar Computing Research Institute (QCRI) believe that the best approach is to focus not on the factuality of individual claims, but on the news sources themselves. Using this tack, they've demonstrated a new system that uses machine learning to determine if a source is accurate or politically biased.

"If a website has published fake news before, there's a good chance they'll do it again," says postdoctoral associate Ramy Baly, lead author on a new paper about the system. "By automatically scraping data about these sites, the hope is that our system can help figure out which ones are likely to do it in the first place."

Baly says the system needs only about 150 articles to reliably detect if a news source can be trusted — meaning that an approach like theirs could be used to help stamp out fake-news outlets before the stories spread too widely.

The system is a collaboration between computer scientists at MIT CSAIL and QCRI, which is part of the Hamad Bin Khalifa University in Qatar. Researchers first took data from Media Bias/Fact Check (MBFC), a website with human fact-checkers who analyse the accuracy and biases of more than 2,000 news sites, from MSNBC and Fox News to low-traffic content farms.

They then fed that data to a machine learning algorithm called a Support Vector Machine (SVM) classifier, and trained it to classify news sites the same way as MBFC. Given a new news outlet, the system was 65 per cent accurate at detecting whether it had a high, medium or low level of "factuality," and roughly 70 per cent accurate at detecting whether it was left-leaning, right-leaning or moderate.
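As a rough illustration of what a three-way SVM classifier involves, here is a minimal sketch of a linear SVM trained one-vs-rest by subgradient descent on the hinge loss. The article does not describe the kernel, features or settings the researchers used, so everything here is an assumption: the two standardised features (a sensational-language score and a Wikipedia-length score), the six toy sources and the hyperparameters are all made up. A real system would more likely rely on an established library than on hand-written training code.

```python
def train_binary_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM: full-batch subgradient descent on the
    L2-regularised hinge loss. Labels in y must be +1 or -1."""
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:  # inside the margin: hinge loss is active
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_one_vs_rest(X, labels, **kwargs):
    """Train one binary SVM per class (that class vs. all others)."""
    models = {}
    for cls in sorted(set(labels)):
        y = [1 if lab == cls else -1 for lab in labels]
        models[cls] = train_binary_svm(X, y, **kwargs)
    return models


def predict(models, x):
    """Return the class whose SVM gives the highest score for x."""
    def score(cls):
        w, b = models[cls]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=score)


# Hypothetical toy data: [sensational_language, wikipedia_length],
# both standardised. These numbers are invented for illustration.
X = [
    [2.0, -1.0], [1.8, -1.2],    # low factuality
    [-1.0, -1.0], [-0.8, -1.2],  # medium factuality
    [-1.0, 2.0], [-1.2, 1.8],    # high factuality
]
labels = ["low", "low", "medium", "medium", "high", "high"]

models = train_one_vs_rest(X, labels)
print(predict(models, [1.9, -1.1]))
```

The one-vs-rest arrangement is only one way to extend a binary SVM to three classes; it is used here because it keeps the sketch short.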

The team determined that the most reliable ways to detect both fake news and biased reporting were to look at the common linguistic features across the source's stories, including sentiment, complexity and structure.

For example, fake-news outlets were found to be more likely to use language that is hyperbolic, subjective, and emotional. In terms of bias, left-leaning outlets were more likely to use language related to concepts of harm/care and fairness/reciprocity, compared to other qualities such as loyalty, authority and sanctity. (These qualities represent the five "moral foundations," a popular theory in social psychology.)
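To make the idea of source-level linguistic features concrete, the sketch below computes a few simple per-article measures and averages them across a source's stories. It is not the researchers' feature set: the word list, the feature names and the sample sentences are all hypothetical, and real sentiment or complexity features would be considerably richer.

```python
import re
from statistics import mean

# Hypothetical mini-lexicon, invented purely for illustration.
HYPERBOLIC = {"shocking", "unbelievable", "outrageous", "destroyed", "totally"}


def article_features(text):
    """Crude per-article proxies for hyperbole, emotion and complexity."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    n_sentences = max(len(sentences), 1)
    return {
        "hyperbole_rate": sum(w in HYPERBOLIC for w in words) / n_words,
        "exclamations_per_sentence": text.count("!") / n_sentences,
        "words_per_sentence": len(words) / n_sentences,
    }


def source_features(articles):
    """Average the article-level features over all of a source's articles."""
    per_article = [article_features(a) for a in articles]
    return {key: mean(f[key] for f in per_article) for key in per_article[0]}


# Made-up example articles for two imaginary outlets.
sensational_outlet = source_features([
    "Shocking! This is totally unbelievable.",
    "Outrageous! The mayor destroyed the plan!",
])
sober_outlet = source_features([
    "The council met on Tuesday.",
    "Officials said the budget will be reviewed next month.",
])
print(sensational_outlet["hyperbole_rate"] > sober_outlet["hyperbole_rate"])
```

Averaging over many articles is what lets a handful of noisy per-story signals add up to a profile of the outlet as a whole.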

Co-author Preslav Nakov says that the system also found correlations with an outlet's Wikipedia page, which it assessed for general length — longer is more credible — as well as target words like "extreme" or "conspiracy theory." It even found correlations with the text structure of a source's URLs — those that had lots of special characters and complicated sub-directories, for example, were associated with less reliable sources.
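The Wikipedia and URL signals described above can be approximated with a few lines of standard-library Python. This is a sketch under stated assumptions: the feature names, the definition of a "special character" and the example URL are all illustrative choices, not details taken from the paper.

```python
from urllib.parse import urlparse


def url_features(url):
    """Toy URL-structure features (names are illustrative assumptions)."""
    parsed = urlparse(url)
    host = parsed.netloc
    # Non-empty path segments stand in for "sub-directories".
    segments = [s for s in parsed.path.split("/") if s]
    # Treat anything other than letters, digits, dots and slashes as special.
    text = host + parsed.path
    special = sum(1 for ch in text if not ch.isalnum() and ch not in "./")
    return {
        "url_length": len(url),
        "path_depth": len(segments),
        "special_chars": special,
        "host_labels": len(host.split(".")) if host else 0,
    }


def wikipedia_features(page_text, target_terms=("extreme", "conspiracy theory")):
    """Length and target-term counts for an outlet's Wikipedia page.
    Pass None when the outlet has no page."""
    if page_text is None:
        return {"has_page": 0, "word_count": 0, "target_term_hits": 0}
    lowered = page_text.lower()
    return {
        "has_page": 1,
        "word_count": len(page_text.split()),
        # Simple substring counts, so "extremely" also matches "extreme".
        "target_term_hits": sum(lowered.count(t) for t in target_terms),
    }


# Hypothetical inputs, made up for this example.
print(url_features("http://real-news_24.example.com/a/b/c/story-1.html"))
print(wikipedia_features("The site promotes conspiracy theory content and extreme views."))
```

Features like these would be combined with the linguistic ones into a single vector per source before being passed to the classifier.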

"Since it is much easier to obtain ground truth on sources [than on articles], this method is able to provide direct and accurate predictions regarding the type of content distributed by these sources," says Sibel Adali, a professor of computer science at Rensselaer Polytechnic Institute who was not involved in the project.

Nakov is quick to caution that the system is still a work-in-progress, and that, even with improvements in accuracy, it would work best in conjunction with traditional fact-checkers.

"If outlets report differently on a particular topic, a site like Politifact could instantly look at our 'fake news' scores for those outlets to determine how much validity to give to different perspectives," says Nakov, a senior scientist at QCRI.

Baly and Nakov co-wrote the new paper with MIT senior research scientist James Glass alongside master's students Dimitar Alexandrov and Georgi Karadzhov of Sofia University. The team will present the work later this month at the 2018 Empirical Methods in Natural Language Processing (EMNLP) conference in Brussels, Belgium.

The researchers also created a new open-source dataset of more than 1,000 news sources, annotated with factuality and bias scores, which is the world's largest database of its kind. As next steps, the team will explore whether the English-trained system can be adapted to other languages, and will look beyond the traditional left/right bias to explore region-specific biases (like the Muslim world's division between religious and secular).

"This direction of research can shed light on what untrustworthy websites look like and the kind of content they tend to share, which would be very useful for both web designers and the wider public," says Andreas Vlachos, a senior lecturer at the University of Cambridge who was not involved in the project.

Nakov says that QCRI also has plans to roll out an app that helps users step out of their political bubbles, responding to specific news items by offering users a collection of articles that span the political spectrum.

"It's interesting to think about new ways to present the news to people," says Nakov. "Tools like this could help people give a bit more thought to issues and explore other perspectives that they might not have otherwise considered."
