Crawling in the dark
AI crawlers are a nightmare for Wikipedia; Nintendo will face new, stiff competition for the Switch 2.
Hello. Today’s pro tip: If you want a quick and easy way to stop Google from giving you AI summaries in your search results, you can just add a curse word to the end of your query.
ARTIFICIAL INTELLIGENCE
AI crawlers are making even the most generous websites miserable
Wikipedia to bots: "Can you please chill for a minute?"
I really hope at least one of you appreciates a Hoobastank reference.
The Wikimedia Foundation, the organization that runs Wikipedia and its associated websites, says the bandwidth it uses for multimedia has grown 50% since January 2024, which it attributes to automated traffic from AI web crawlers.
Background: Web crawlers are bots that visit web pages and browse their content. They’re typically used by search engines to make an index of web pages so they can be retrieved when a user makes a search. But they have also been used by developers to scrape the troves of data needed to train AI models.
Websites have a file called robots.txt that tells crawlers what they can and cannot access. But this is less of a line of code that bots have to follow and more of a polite request — one that some AI companies have been known to ignore.
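To make the "polite request" point concrete, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard-library `urllib.robotparser`. The rules, the bot name `ExampleAIBot`, and the `example.org` URLs are made up for illustration and are not taken from Wikipedia's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is asked to stay out entirely,
# everyone else is asked to avoid /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "SomeBrowser"):
        print(agent, is_allowed(agent, "https://example.org/wiki/Ottawa"))
```

Nothing here is enforced by the website. The check only runs if the crawler's author chooses to call it, which is why a bot that skips it can fetch everything anyway.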
What’s happening: Wikipedia caches pages that are particularly popular or seeing a spike in traffic due to public interest. Those pages are served to users from the data centre closest to them. But web crawlers access everything, including obscure, rarely visited pages that have to be served from Wikimedia’s core data centre, which uses more resources and is more expensive.
Wikimedia says 63% of its “most expensive” traffic comes from bots.
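The caching effect described above can be illustrated with a toy model. This is a rough sketch under simplifying assumptions, not Wikimedia's actual architecture: the page names, the request lists, and the function `count_core_fetches` are all hypothetical, and the cache is treated as a fixed set of popular pages.

```python
def count_core_fetches(requests, cached_pages):
    """Count requests that miss the nearby cache and must go to the core data centre."""
    return sum(1 for page in requests if page not in cached_pages)


# Assumption: only popular pages are held in the nearby cache.
cached_pages = {"Main_Page", "Popular_Topic"}

# Human readers mostly cluster on popular pages.
human_requests = ["Main_Page", "Popular_Topic", "Main_Page", "Obscure_Page_1"]

# A crawler walks through every page once, popular or not.
crawler_requests = ["Main_Page", "Popular_Topic", "Obscure_Page_1", "Obscure_Page_2"]

if __name__ == "__main__":
    print("human core fetches:", count_core_fetches(human_requests, cached_pages))
    print("crawler core fetches:", count_core_fetches(crawler_requests, cached_pages))
```

With the same number of requests, the crawler's list produces more cache misses, which is the pattern behind bots accounting for a large share of the most expensive traffic.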
Wikimedia has been relatively AI-friendly compared to other website publishers, acknowledging that its content makes up a significant part of AI training data. In a way, its mission of making knowledge openly available to all would suggest ChatGPT and the like would be encouraged to spread that knowledge even further.
But Wikimedia content, from the text in articles to the massive cache of photos people make available, is generally licensed under Creative Commons, meaning material is free to share and redistribute — so long as it is properly attributed, which Wikimedia says hasn’t been happening.
In addition to being a license violation, Wikimedia says this makes it harder to attract new users (be they volunteers who maintain pages and provide multimedia, or would-be donors).
Why it matters: An uptick in AI bot traffic typically raises copyright concerns. But even for people who want the stuff on their website to be used by any person or bot that can access it, AI is causing expensive and annoying problems. Some have compared the uptick in automated bot visits to DDoS attacks, where hackers overwhelm a web service and cause it to crash.
As AI companies try to make models capable of coding, open-source developers are seeing their projects and code repositories slow down and crash, and their bandwidth costs rise.
In January, Triplegangers — a company that sells pre-made 3D models for use in art and games — had so many visits from OpenAI’s crawlers that its ecommerce site went down.
Why it’s tricky: Some developers have made tools to block crawlers, but those risk blocking legitimate traffic. Plus, blocking a bot is a great way to flag whoever owns it to come up with some tactic that gets around whatever hurdle has been put in front of them.
Don’t block bots, mess with them: Networking and cybersecurity company Cloudflare has developed a new service for customers that wastes an AI crawler's time. When a bot is detected, Cloudflare creates an AI-generated series of fake web pages that look real enough to make a crawler scrape them, but end up getting it stuck in a “labyrinth” of links. The initial link maze is also hidden so only bots will be able to find it, ensuring no human visitors get trapped.
IN OTHER NEWS
Amazon the latest to bid for TikTok. The White House, which needs to sign off on a deal to prevent the app from being banned before April 5, is reportedly not taking the offer seriously. Amazon joins other groups that have claimed (with varying credibility) that they’ll buy TikTok: those reportedly include Oracle and VC firm Andreessen Horowitz; internet advocacy organization Project Liberty, which includes Reddit co-founder Alexis Ohanian and billionaire Frank McCourt; Perplexity AI, which says it would reverse-engineer TikTok’s algorithm; and Zoop, a creator platform started by OnlyFans founder Tim Stokely. (The New York Times)
U.S. national security advisor is really bad at security. Mike Waltz, who has taken responsibility for accidentally inviting The Atlantic’s editor-in-chief to a Signal group chat discussing attack plans in Yemen, reportedly used his personal Gmail to discuss sensitive military info. This comes days after Waltz was found to have left his Venmo friend list public, showing connections to politicians, political operatives, members of the military, and Fox News journalists. (CNBC)
Ontario Securities Commission aims to crack down on Polymarket. Polymarket lets users wager on future events, like economic indicators, government decisions, and elections. The OSC claims these are “binary options,” an all-or-nothing bet that was banned in 2017. A hearing for a proposed settlement is scheduled for April 14. (Investment Executive)
Advertising giant is offering big incentives to get brands back on X. Omnicom Media Group — whose clients include Apple, PepsiCo, McDonald's, Nissan, Pfizer, and Unilever — is offering brands that did not spend on X last year up to US$200,000 worth of placements and services, in addition to hefty discounts on ad space. (Adweek)
CONSUMER TECH
Nintendo's realm is now a multiplayer battle
People are eagerly awaiting the Switch 2, but gamers have way more handheld options than they did in 2017.
Nintendo gave the first detailed look at the Switch 2, its long-awaited follow-up to the Switch, its best-selling home console of all time. Among the features:
A slightly bigger 1080p screen and 4K resolution when playing on a TV.
Controllers that can also be used like a mouse, which may make bringing PC games to the Switch 2 a bit easier.
A button that automatically opens up a Discord-like game chat, along with a camera attachment that can be used for video chat and streaming.
A custom Nvidia-made processor that is presumably more powerful than the Switch’s aging hardware, though no technical specs have been released.
Previously announced “Virtual Game Cards” that will not only let players bring Switch games to the new console, but let them lend out digital games like they would physical ones.
Big picture: When the Switch launched, it was the first to offer home console-level gaming in a portable format. Now, the handheld market is much more crowded with more capable competitors. The Steam Deck was an attempt to get PC gaming power in a handheld for people intimidated by the cost and technical knowledge needed to build a computer themselves. Since most games are available on PC, that offers a level of versatility Nintendo can’t match.
The Steam Deck’s success has gotten several computer makers to make their own effort at a gaming handheld, including Asus and Lenovo.
Both Sony and Microsoft are reportedly actively developing new handheld consoles to get in on the market.
It’s in the game: The first Switch’s hardware is not as powerful as its competitors’, so games requiring robust processing were not released on the console. But this became an inadvertent experiment that proved something many have long thought about the company: a Nintendo console could be successful on the strength of its owned franchises (and, to a lesser extent, indie games and re-releases that don’t need expensive hardware).
The Switch’s 50 top-selling games have sold over 624 million copies. All but six of them were published by Nintendo.
At launch, the Switch 2 will have new games in the Mario Kart, Metroid, Donkey Kong, and Pokémon franchises, as well as revamped versions from series like Legend of Zelda and Kirby.
On the third-party front, Nintendo landed Hollow Knight: Silksong — a sequel that has become a bit of a running joke for how long fans have been waiting for it — and The Duskbloods, a Switch 2 exclusive made by FromSoftware (FromSoftware’s Elden Ring, one of those hit games that wasn’t available for Switch, will get a Switch 2 version).
Money talks: The Switch 2 will cost US$450 when it’s released on June 5 (Canadian pricing has yet to be announced). The company has made a habit of pricing its consoles below competitors, but U.S. tariffs and inflation likely pushed the price up.
By comparison, the PlayStation 5 base model was US$499 at launch in 2020.
Possibly a more comparable device is the Steam Deck, which runs US$350 to $649, depending on the model, though its dock for playing on a monitor is sold separately.
Unrelated, but still neat: To hinder scalpers, Nintendo is only making Switch 2 preorders available to people who played the first Switch for at least 50 hours as of today.
ALSO WORTH YOUR TIME
Uh, how is Meta (and other companies using LibGen’s pirated book database) able to train AI on unpublished books?
A fan believes he has been banned from Madison Square Garden after being picked out by facial recognition tech.
What U.S. officials can and cannot do with your devices when you cross the border — and how to protect your digital privacy.
Research shows male fruit flies get drunk to become more attractive to females.
A (still-growing) list of all the people involved with DOGE, and the private-sector companies they work for.
Learning nothing from the last three decades of internet use, Character.AI has implemented parental controls that are very easy for kids to bypass.