The fatal "soft underbelly" of start-up AI companies: not short of money but urgently short of "it"

**Source:** Financial Association

Editor: Xiaoxiang

Image credit: Generated by Unbounded AI tools

As ChatGPT drives the AIGC craze to heat up rapidly around the world, a large number of generative artificial intelligence startups are springing up like mushrooms.

**However, even though these startups can easily raise billions of dollars in investment, they still have an Achilles heel that is almost unavoidable at present: a lack of training data, which may ultimately prove to be the biggest "stumbling block" on their road to success.**

Brad Svrluga, co-founder and general partner of venture capital firm Primary Venture Partners, said the firm has received many pitches from AI startups that lack the training data needed to build powerful applications, not to mention proprietary data that could help them build a competitive moat around their business.

Data is "scarcer" than money

Venture capital funding for generative AI startups has grown from $4.8 billion in 2022 to $12.7 billion in the first five months of 2023, according to PitchBook.

Now, many of these companies are looking to build more niche AI models in fields like finance or healthcare, where training datasets are not easy to come by.

**Paul Tyma, chief technology officer of Bullpen Capital, pointed out that building actual models has become commoditized to some extent, and the real value lies in the data.**

Some AI startups are targeting partnerships with large, data-rich enterprises. For example, Marna Ricker, Ernst & Young's global vice chair for tax, said that because the company holds a large amount of transaction data, generative AI startups approach it about partnerships every day.

But Andy Baldwin, EY's global client services managing partner, said he was concerned about what would happen if EY's data were used to train external models.

"Who will own the data? When we train the model, what is our access to the model? How can other people use the model?" Baldwin said. "The data is part of our intellectual property."

Of course, startups can address the intellectual property problem by training a separate model for each customer, using only that customer's data. Startup TermSheet is using this strategy to build its Ethan product, a generative AI model capable of answering industry questions for real estate developers, brokers and investors.

But Roger Smith, chief executive and co-founder of TermSheet, said even getting customers to agree to that would take a lot of convincing.

**Andy Wilson, co-founder and CEO of legal technology company Logikcull, pointed out that convincing companies that you have strong cybersecurity capabilities and can actually protect their data is also a challenge.**

Big companies hold huge advantages

**Primary Venture Partners' Svrluga said big tech companies clearly have an advantage over startups when it comes to generative AI applications, partly because they have already earned the trust of larger customers, who feel more comfortable with how those companies handle their data.**

Tracy Daniels, chief data officer at financial services company Truist, said she is currently only exploring use cases for generative AI with large technology companies, not startups. She said she would trust the larger vendors to keep the data safe.

What this all means is that even startups that can get a head start with publicly available data face challenges in enriching their models with enterprise datasets.

Veesual is an artificial intelligence startup that generates images of what people look like when they try on clothes. The company initially trained its models primarily using public images on the internet, but has since struggled to get large retailers to agree to hand over their data to enhance the models.

Veesual CEO and co-founder Maxime Patte said that in some cases, large retailers even wanted Veesual to pay hefty dividends or give up an equity stake in the company in exchange for the right to use their data. None of these deals was ultimately concluded.

PatentPal is a generative artificial intelligence startup that helps law firms draft patent applications. Its chief executive and founder, Jack Xu, also said the company initially could only train on publicly available patent filings.

The AI tool has the potential to become even more accurate if it continues to be trained with encrypted or anonymous feedback from actual customer cases, he said. But doing this is complicated because feedback must be kept separate from highly sensitive and confidential data, including trade secrets.

“For early-stage startups, there is a brand recognition issue, and there is also a social identity issue,” he said.

**At the same time, the "involution" within the industry is becoming increasingly intense.** Adam Struck, founder and managing partner of Struck Capital, said some startups are racing each other to secure more data in certain areas, and to do so faster.

“If you believe there’s a proprietary data set, you want to get it before everyone else, and you negotiate exclusivity,” he said. “*In that sense, it’s almost become an arms race.*”
