The fatal "soft underbelly" of start-up AI companies: not short of money but urgently short of "it"
**Source:** Financial Association
Editor: Xiaoxiang
As ChatGPT drives the AIGC craze to heat up rapidly around the world, a large number of generative artificial intelligence startups are springing up like mushrooms.
However, even though these startups can easily raise billions of dollars in investment, they still have an Achilles' heel that is almost unavoidable at present: **a lack of training data, which may ultimately be the biggest "stumbling block" on their road to success.**
Brad Svrluga, co-founder and general partner of venture capital firm Primary Venture Partners, said the firm has received many pitches from AI startups that lack the training data needed for powerful applications, not to mention proprietary data that could help them build a competitive moat around their business.
Data is more "rare" than money
Venture capital funding for generative AI startups has grown from $4.8 billion in 2022 to $12.7 billion in the first five months of 2023, according to PitchBook.
Now, many of these companies are looking to build more niche AI models in fields like finance or healthcare, where training datasets are not easy to come by.
**Paul Tyma, chief technology officer of Bullpen Capital, pointed out that building the models themselves has become commoditized to some extent, and the real value lies in the data.**
Some AI startups are targeting partnerships with large, data-rich enterprises. For example, Marna Ricker, Ernst & Young's global vice chair for tax, said that because the firm holds a large amount of transaction data, generative AI startups approach it about partnerships every day.
But Andy Baldwin, EY's global client services managing partner, said he was concerned about what would happen if EY's data were used to train external models.
"Who will own the data? When we train the model, what is our access to the model? How can other people use the model?" Baldwin said. "The data is part of our intellectual property."
Of course, startups can address the intellectual property problem by training a separate model for each customer, using only that customer's data. Startup TermSheet is using this strategy to build its Ethan product, a generative AI model capable of answering industry questions for real estate developers, brokers and investors.
But Roger Smith, chief executive and co-founder of TermSheet, said even getting customers to agree to that would take a lot of convincing.
**Andy Wilson, co-founder and CEO of legal technology company Logikcull, pointed out that convincing companies that you have strong cybersecurity capabilities and can actually protect their data is also a challenge.**
Big companies hold huge advantages
**Primary Venture Partners' Svrluga said big tech companies clearly have an advantage over startups when it comes to generative AI applications, partly because they have already earned the trust of larger customers, who feel more comfortable with how those companies handle their data.**
Tracy Daniels, chief data officer at financial services company Truist, said she is currently only exploring use cases for generative AI with large technology companies, not startups. She said she would trust the larger vendors to keep the data safe.
What this all means is that even startups that can get a head start with publicly available data face challenges in enriching their models with enterprise datasets.
Veesual is an artificial intelligence startup that generates images of what people look like when they try on clothes. The company initially trained its models primarily using public images on the internet, but has since struggled to get large retailers to agree to hand over their data to enhance the models.
Veesual CEO and co-founder Maxime Patte said that in some cases, large retailers even wanted Veesual to pay huge dividends, or wanted to take an equity stake in the company, in exchange for Veesual's right to use the data. These deals ultimately fell through.
PatentPal is a generative artificial intelligence startup that helps law firms draft patent applications. Its chief executive and founder, Jack Xu, also said the company initially could only train on publicly available patent filings.
The AI tool has the potential to become even more accurate if it continues to be trained with encrypted or anonymous feedback from actual customer cases, he said. But doing this is complicated because feedback must be kept separate from highly sensitive and confidential data, including trade secrets.
“For early-stage startups, there is a brand recognition issue, and there is also a social identity issue,” he said.
**At the same time, the "involution" (ever fiercer internal competition) within the industry is becoming more and more intense.** Adam Struck, founder and managing partner of Struck Capital, said some startups are racing one another to secure more data in certain areas, and to do so faster.
“If you believe there’s a proprietary data set, you want to get it before everyone else, and you negotiate exclusivity,” he said. *“In that sense, it’s almost become an arms race.”*