As ByteDance develops artificial intelligence models to compete in China, the bot it uses to scrape data to train those models is reportedly spiking in activity.

The TikTok owner launched its own web scraper, Bytespider, in April, and it’s now scraping data multiple times faster than bots from other companies, Fortune reported, citing research from Kasada, a bot management company, and Dark Visitors, a monitor of scraper bots. Companies developing AI models, such as Google (GOOGL) and Meta (META), use scraper bots to gather data to train and improve the large language models (LLMs) and multimodal models that power the companies’ AI services.

Bytespider is scraping web data about 25 times faster than OpenAI’s web scraper, GPTBot, Sam Crowther, CEO of Kasada, told Fortune. Compared with Anthropic’s ClaudeBot, Bytespider is 3,000 times faster.

Like OpenAI’s and Anthropic’s bots, Bytespider ignores instructions from robots.txt, a file that is not legally binding and that tells web scrapers which parts of a website they can and cannot access, Fortune reported. According to Kasada’s data, Bytespider has had spikes in scraping activity in the last six weeks.
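
For readers unfamiliar with robots.txt, the short Python sketch below shows how a scraper that chooses to comply would typically consult the file before fetching a page. It uses the standard library's `urllib.robotparser`; the rules, the `example.com` URLs, the `OtherBot` user agent, and the `can_fetch` helper are all hypothetical and are not taken from any site or bot named in this story.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for illustration only.
ROBOTS_TXT = """\
User-agent: Bytespider
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_fetch(user_agent: str, url: str) -> bool:
    """Return True if the hypothetical rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("Bytespider", "https://example.com/articles/1"),
        ("OtherBot", "https://example.com/articles/1"),
        ("OtherBot", "https://example.com/private/report"),
    ]:
        print(agent, url, can_fetch(agent, url))
```

Nothing in the file enforces these rules. The check only matters if the scraper runs it and honors the answer, which is why a bot can simply ignore robots.txt.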

“It’s like they’re trying desperately to catch up,” Crowther told Fortune.

ByteDance did not immediately respond to a request for comment.

The China-based company released its AI-powered chatbot, Doubao, last August, and it’s proving to be a tough competitor to homegrown rival Baidu’s (BIDU) Ernie Bot. In May, ByteDance launched a series of Doubao LLMs for enterprises, which cost less than models from the company’s Chinese competitors.

Now, ByteDance is planning to build a new AI model using chips from China’s Huawei, Reuters reported, citing three unnamed people familiar with the matter. However, a spokesperson for ByteDance previously told Quartz the company is not developing a new AI model.

The company has also designed two AI chips with Taiwan Semiconductor Manufacturing Company (TSM) that ByteDance plans to mass produce by 2026, The Information reported, citing unnamed people familiar with the matter. By producing its own chips, the company could become less dependent on Nvidia’s (NVDA) pricey graphics processing units, or GPUs, which are subject to U.S. export controls, those people told The Information.