
ChatGPT: Two Years Later. Tracing the impact of the generative AI…
By Julián Peller, November 2024


This November 30 marks the second anniversary of ChatGPT’s launch, an event that sent shockwaves through technology, society, and the economy. The space opened by this milestone has not always made it easy — or perhaps even possible — to separate reality from expectations. For example, this year Nvidia became the most valuable public company in the world during a stunning bullish rally. The company, which manufactures hardware used by models like ChatGPT, is now worth seven times what it was two years ago. The obvious question for everyone is: Is it really worth that much, or are we in the midst of collective delusion? This question — and not its eventual answer — defines the current moment.

AI is making waves not just in the stock market. Last month, for the first time in history, prominent figures in artificial intelligence were awarded the Nobel Prizes in Physics and Chemistry. John J. Hopfield and Geoffrey E. Hinton received the Physics Nobel for their foundational contributions to neural network development.

In Chemistry, Demis Hassabis and John Jumper were recognized for AlphaFold’s advances in protein structure prediction, and David Baker for computational protein design, both achievements powered by artificial intelligence. These awards generated surprise on one hand and understandable disappointment among traditional scientists on the other, as computational methods took center stage.

ChatGPT was launched on November 30, 2022 (Photo by Rolf van Root on Unsplash).

In this context, I aim to review what has happened since that November, reflecting on the tangible and potential impact of generative AI to date, considering which promises have been fulfilled, which remain in the running, and which seem to have fallen by the wayside.

Let’s begin by recalling the day of the launch. ChatGPT, built on GPT-3.5, was a chatbot far superior to anything previously known in terms of discourse and apparent intelligence. The gap between what had been possible until then and what ChatGPT could do generated enormous fascination, and the product went viral rapidly: it reached 100 million users in just two months, far outpacing applications considered viral successes (TikTok, Instagram, Pinterest, Spotify, etc.). It also entered mass media and public debate: AI landed in the mainstream, and suddenly everyone was talking about ChatGPT. To top it off, just a few months later, OpenAI launched GPT-4, a model vastly superior to 3.5 in intelligence and also capable of understanding images.

The situation sparked debates about the many possibilities and problems inherent to this specific technology, including copyright, misinformation, productivity, and labor market issues. It also raised concerns about the medium- and long-term risks of advancing AI research, such as existential risk (the “Terminator” scenario), the end of work, and the potential for artificial consciousness. In this broad and passionate discussion, we heard a wide range of opinions. Over time, I believe the debate began to mature and temper. It took us a while to adapt to this product because ChatGPT’s leap forward caught us all somewhat off guard. What has happened since then?

As far as technology companies are concerned, these past two years have been a roller coaster. The appearance on the scene of OpenAI, with its futuristic advances and its CEO with a “startup” spirit and look, raised questions about Google’s technological leadership, which until then had been undisputed. Google, for its part, did everything it could to confirm these doubts, repeatedly humiliating itself in public. First came the embarrassment of Bard’s launch — the chatbot designed to compete with ChatGPT. In the demo video, the model made a factual error: when asked about the James Webb Space Telescope, it claimed it was the first telescope to photograph planets outside the solar system, which is false. This misstep caused Google’s stock to drop by 9% in the following week. Later, during the presentation of its new Gemini model — another competitor, this time to GPT-4 — Google lost credibility again when it was revealed that the incredible capabilities showcased in the demo (which could have placed it at the cutting edge of research) were, in reality, fabricated, based on much more limited capabilities.

The day Goliath stumbled (Photo by Shutter Speed on Unsplash).

Meanwhile, Microsoft — the archaic company of Bill Gates that produced the old Windows 95 and was as hated by young people as Google was loved — reappeared and allied with the small David, integrating ChatGPT into Bing and presenting itself as agile and defiant. “I want people to know we made them dance,” said Satya Nadella, Microsoft’s CEO, referring to Google. In 2023, Microsoft rejuvenated while Google aged.

This situation persisted, and OpenAI remained for some time the undisputed leader in both technical evaluations and subjective user feedback (known as “vibe checks”), with GPT-4 at the forefront. But over time this changed: just as GPT-4 had achieved unique leadership after its launch in early 2023, by mid-2024 its close successor (GPT-4o) was competing with others of its caliber: Google’s Gemini 1.5 Pro, Anthropic’s Claude 3.5 Sonnet, and xAI’s Grok 2. What innovation gives, innovation takes away.

This scenario could be shifting again with OpenAI’s recent announcement of o1 in September 2024 and rumors of new launches in December. For now, however, regardless of how good o1 may be (we’ll talk about it shortly), it doesn’t seem to have caused the same seismic impact as ChatGPT or conveyed the same sense of an unbridgeable gap with the rest of the competitive landscape.

To round out the scene of hits, falls, and epic comebacks, we must talk about the open-source world. This new AI era began with two gut punches to the open-source community. First, OpenAI, despite what its name implies, was a pioneer in halting the public disclosure of fundamental technological advancements. Before OpenAI, the norms of artificial intelligence research — at least during the golden era before 2022 — entailed detailed publication of research findings. During that period, major corporations fostered a positive feedback loop with academia and published papers, something previously uncommon. Indeed, ChatGPT and the generative AI revolution as a whole are based on a 2017 paper from Google, the famous Attention Is All You Need, which introduced the Transformer neural network architecture. This architecture underpins all current language models and is the “T” in GPT. In a dramatic plot twist, OpenAI leveraged this public discovery by Google to gain an advantage and began pursuing closed-door research, with GPT-4’s launch marking the turning point between these two eras: OpenAI disclosed nothing about the inner workings of this advanced model. From that moment, many closed models, such as Gemini 1.5 Pro and Claude Sonnet, began to emerge, fundamentally shifting the research ecosystem for the worse.

The second blow to the open-source community was the sheer scale of the new models. Until GPT-2, a modest GPU was sufficient to train deep learning models. Starting with GPT-3, infrastructure costs skyrocketed, and training models became inaccessible to individuals or most institutions. Fundamental advancements fell into the hands of a few major players.

But after these blows, and with everyone anticipating a knockout, the open-source world fought back and proved itself capable of rising to the occasion. For everyone’s benefit, it had an unexpected champion. Mark Zuckerberg, the most hated reptilian android on the planet, made a radical change of image by positioning himself as the flagbearer of open source and freedom in the generative AI field. Meta, the conglomerate that controls much of the digital communication fabric of the West according to its own design and will, took on the task of bringing open source into the LLM era with its LLaMa model line. It’s definitely a bad time to be a moral absolutist. The LLaMa line began with timid open licenses and limited capabilities (although the community made significant efforts to believe otherwise). However, with the recent releases of LLaMa 3.1 and 3.2, the gap with private models has begun to narrow significantly. This has allowed the open-source world and public research to remain at the forefront of technological innovation.

LLaMa models are open source alternatives to closed-source corporate LLMs (Photo by Paul Lequay on Unsplash).

Over the past two years, research into ChatGPT-like models, known as large language models (LLMs), has been prolific. The first fundamental advancement, now taken for granted, is that companies managed to increase the context windows of models (how many words they can read as input and generate as output) while dramatically reducing costs per word. We’ve also seen models become multimodal, accepting not only text but also images, audio, and video as input. Additionally, they have been enabled to use tools — most notably, internet search — and have steadily improved in overall capacity.
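To make the idea of “tool use” a bit more concrete, here is a minimal sketch of how an application typically wires a tool into a chat loop. Everything in it is a hypothetical stand-in (the `call_llm` and `web_search` functions and the message format are invented for illustration, not any particular vendor’s API): the model either answers directly or emits a structured tool call, the application executes the tool, and the result is fed back to the model.

```python
import json

def call_llm(messages: list[dict]) -> str:
    """Hypothetical stand-in for a chat-model call (a real provider's API would
    go here). For this demo it asks for a search first, then answers directly."""
    if any(m["role"] == "tool" for m in messages):
        return "It is currently 24°C and sunny in Buenos Aires."
    return json.dumps({"tool": "web_search", "query": "weather in Buenos Aires"})

def web_search(query: str) -> str:
    """Hypothetical search tool; a real implementation would call a search API."""
    return f"Top result for '{query}': 24°C, sunny."

TOOLS = {"web_search": web_search}

def chat_with_tools(user_message: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        reply = call_llm(messages)
        try:
            call = json.loads(reply)                 # structured tool call
        except json.JSONDecodeError:
            return reply                             # plain text: final answer
        result = TOOLS[call["tool"]](call["query"])  # run the tool locally
        messages.append({"role": "tool", "content": result})
    return reply

print(chat_with_tools("What's the weather like in Buenos Aires right now?"))
```

Real implementations add tool schemas, multiple tools, and error handling, but the loop itself is essentially this simple.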

On another front, various quantization and distillation techniques have emerged, enabling the compression of enormous models into smaller versions, even to the point of running language models on desktop computers (albeit sometimes at the cost of unacceptable performance reductions). This optimization trend appears to be on a positive trajectory, bringing us closer to small language models (SLMs) that could eventually run on smartphones.
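As a rough illustration of what quantization does, here is a minimal sketch (Python with NumPy, on a toy weight matrix rather than a real LLM) of symmetric 8-bit quantization: each weight is mapped to an integer in [-127, 127] plus a single scale factor, cutting memory roughly fourfold versus 32-bit floats at the cost of a small approximation error.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: float32 -> (int8, scale)."""
    scale = np.max(np.abs(weights)) / 127.0   # largest weight maps to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor for use at inference time."""
    return q.astype(np.float32) * scale

# Toy "layer" standing in for an LLM weight matrix (sizes are illustrative).
w = np.random.randn(4096, 4096).astype(np.float32)
q, scale = quantize_int8(w)

print("memory, float32:", w.nbytes / 2**20, "MiB")   # ~64 MiB
print("memory, int8:   ", q.nbytes / 2**20, "MiB")   # ~16 MiB
print("max abs error:  ", np.max(np.abs(w - dequantize(q, scale))))
```

Production schemes (4-bit variants, per-channel scales, distillation on top) are more sophisticated, but this is the basic trade-off being made when a large model is squeezed onto a desktop machine.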

On the downside, no significant progress has been made in controlling the infamous hallucinations — false yet plausible-sounding outputs generated by models. Once a quaint novelty, this issue now seems confirmed as a structural feature of the technology. For those of us who use this technology in our daily work, it’s frustrating to rely on a tool that behaves like an expert most of the time but commits gross errors or outright fabricates information roughly one out of every ten times. In this sense, Yann LeCun, Meta’s chief AI scientist and a major figure in the field, seems vindicated, as he had adopted a more deflationary stance on LLMs during the 2023 hype peak.

However, pointing out the limitations of LLMs doesn’t mean the debate is settled about what they’re capable of or where they might take us. For instance, Sam Altman believes the current research program still has much to offer before hitting a wall, and the market, as we’ll see shortly, seems to agree. Many of the advancements we’ve seen over the past two years support this optimism. OpenAI launched its voice assistant and an improved version capable of near-real-time interaction with interruptions — like human conversations rather than turn-taking. More recently, we’ve seen the first advanced attempts at LLMs gaining access to and control over users’ computers, as demonstrated in the GPT-4o demo (not yet released) and in Claude 3.5, which is available to end users. While these tools are still in their infancy, they offer a glimpse of what the near future could look like, with LLMs having greater agency. Similarly, there have been numerous breakthroughs in automating software engineering, highlighted by debatable milestones like Devin, the first “artificial software engineer.” While its demo was heavily criticized, this area — despite the hype — has shown undeniable, impactful progress. For example, in the SWE-bench benchmark, used to evaluate AI models’ abilities to solve software engineering problems, the best models at the start of the year could solve less than 13% of exercises. As of now, that figure exceeds 49%, justifying confidence in the current research program to enhance LLMs’ planning and complex task-solving capabilities.

Along the same lines, OpenAI’s recent announcement of the o1 model signals a new line of research with significant potential, despite the currently released version (o1-preview) not being far ahead of what’s already known. In fact, o1 is based on a novel idea: leveraging inference time — not training time — to improve the quality of generated responses. With this approach, the model doesn’t immediately produce the most probable next word but has the ability to “pause to think” before responding. One of the company’s researchers suggested that, eventually, these models could use hours or even days of computation before generating a response. Preliminary results have sparked high expectations, as using inference time to optimize quality was not previously considered viable. We now await subsequent models in this line (o2, o3, o4) to confirm whether it is as promising as it currently seems.
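OpenAI hasn’t published how o1 actually spends its extra inference-time compute, so the following is only a minimal sketch of the general idea using the simplest known variant, best-of-N sampling with a verifier: generate several candidate answers, score them, and keep the best one. The `generate` and `score` functions are hypothetical stand-ins for a model call and a quality estimate.

```python
import random

def generate(prompt: str, temperature: float = 1.0) -> str:
    """Hypothetical stand-in for one sampled model response."""
    return f"candidate answer {random.randint(0, 9999)} to: {prompt}"

def score(prompt: str, answer: str) -> float:
    """Hypothetical stand-in for a verifier / reward-model score."""
    return random.random()

def answer_with_inference_time_compute(prompt: str, n_samples: int = 16) -> str:
    # Spend more compute at inference: sample many candidate answers...
    candidates = [generate(prompt) for _ in range(n_samples)]
    # ...then keep the one the verifier rates highest. A larger n_samples means
    # more "thinking time" per question, with no additional training required.
    return max(candidates, key=lambda c: score(prompt, c))

print(answer_with_inference_time_compute("How many primes are below 100?"))
```

The dial here is `n_samples`: quality can be traded directly against latency and cost at the moment of answering, which is precisely what makes this research line interesting.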

Beyond language models, these two years have seen enormous advancements in other areas. First, we must mention image generation. Text-to-image models began to gain traction even before chatbots and have continued developing at an accelerated pace, expanding into video generation. This field reached a high point with the introduction of OpenAI’s Sora, a model capable of producing extremely high-quality videos, though it was not released. Slightly less known but equally impressive are advances in music generation, with platforms like Suno and Udio, and in voice generation, which has undergone a revolution and achieved extraordinarily high-quality standards, led by ElevenLabs.

It has undoubtedly been two intense years of remarkable technological progress and almost daily innovations for those of us involved in the field.

If we turn our attention to the financial aspect of this phenomenon, we will see vast amounts of capital being poured into the world of AI in a sustained and growing manner. We are currently in the midst of an AI gold rush, and no one wants to be left out of a technology that its inventors, modestly, have presented as equivalent to the steam engine, the printing press, or the internet.

It may be telling that the company that has capitalized the most on this frenzy doesn’t sell AI but rather the hardware that serves as its infrastructure, aligning with the old adage that during a gold rush, a good way to get rich is by selling shovels and picks. As mentioned earlier, Nvidia has positioned itself as the most valuable company in the world, reaching a market capitalization of $3.5 trillion. For context, $3,500,000,000,000 is larger than France’s entire GDP.

We are currently in the midst of an AI gold rush, and no one wants to be left out (Photo by Dimitri Karastelev on Unsplash).

On the other hand, if we look at the list of publicly traded companies with the highest market value, we see tech giants linked partially or entirely to AI promises dominating the podium. Apple, Nvidia, Microsoft, and Google are the top four as of the date of this writing, with a combined capitalization exceeding $12 trillion. For reference, in November 2022, the combined capitalization of these four companies was less than half of this value. Meanwhile, generative AI startups in Silicon Valley are raising record-breaking investments. The AI market is bullish.

While the technology advances fast, the business model for generative AI — beyond the major LLM providers and a few specific cases — remains unclear. As this bullish frenzy continues, some voices, including recent economics Nobel laureate Daron Acemoglu, have expressed skepticism about AI’s ability to justify the massive amounts of money being poured into it. For instance, in this Bloomberg interview, Acemoglu argues that current generative AI will only be able to automate less than 5% of existing tasks in the next decade, making it unlikely to spark the productivity revolution investors anticipate.

Is this AI fever or rather AI feverish delirium? For now, the bullish rally shows no signs of stopping, and like any bubble, it will be easy to recognize in hindsight. But while we’re in it, it’s unclear if there will be a correction and, if so, when it might happen. Are we in a bubble about to burst, as Acemoglu believes, or, as one investor suggested, is Nvidia on its way to becoming a $50 trillion company within a decade? This is the million-dollar question and, unfortunately, dear reader, I do not know the answer. Everything seems to indicate that, just like in the dot-com bubble, we will emerge from this situation with some companies riding the wave and many underwater.

Let’s now discuss the broader social impact of generative AI’s arrival. The leap in quality represented by ChatGPT, compared to the socially known technological horizon before its launch, caused significant commotion, opening debates about the opportunities and risks of this specific technology, as well as the potential opportunities and risks of more advanced technological developments.

The problem of the future
The debate over the proximity of artificial general intelligence (AGI) — AI reaching human or superhuman capabilities — gained public relevance when Geoffrey Hinton (now a Physics Nobel laureate) resigned from his position at Google to warn about the risks such development could pose. Existential risk — the possibility that a super-capable AI could spiral out of control and either annihilate or subjugate humanity — moved out of the realm of fiction to become a concrete political issue. We saw prominent figures with moderate, non-alarmist reputations express concern in public debates and even in U.S. Senate hearings. They warned of the possibility of AGI arriving within the next ten years and the enormous problems this would entail.

The urgency that surrounded this debate now seems to have faded, and in hindsight, AGI appears further away than it did in 2023 (Photo by Axel Richter on Unsplash).

The urgency that surrounded this debate now seems to have faded, and in hindsight, AGI appears further away than it did in 2023. It’s common to overestimate achievements immediately after they occur, just as it’s common to underestimate them over time. This latter phenomenon even has a name: the AI Effect, where major advancements in the field lose their initial luster over time and cease to be considered “true intelligence.” If today the ability to generate coherent discourse — like the ability to play chess — is no longer surprising, this should not distract us from the timeline of progress in this technology. In 1997, Deep Blue defeated chess champion Garry Kasparov. In 2016, AlphaGo defeated Go master Lee Sedol. And in 2022, ChatGPT produced high-quality, articulate text, even challenging the famous Turing Test as a benchmark for determining machine intelligence. I believe it’s important to sustain discussions about future risks even when they no longer seem imminent or urgent. Otherwise, cycles of fear and calm prevent mature debate. Whether through the research direction opened by o1 or new pathways, it’s likely that within a few years, we’ll see another breakthrough on the scale of ChatGPT in 2022, and it would be wise to address the relevant discussions before that happens.

A separate chapter on AGI and AI safety involves the corporate drama at OpenAI, worthy of prime-time television. In late 2023, Sam Altman was abruptly removed by the board of directors. Although the full details were never clarified, Altman’s detractors pointed to an alleged culture of secrecy and disagreements over safety issues in AI development. The decision sparked an immediate rebellion among OpenAI employees and drew the attention of Microsoft, the company’s largest investor. In a dramatic twist, Altman was reinstated, and the board members who removed him were dismissed. This conflict left a rift within OpenAI: Jan Leike, the head of AI safety research, joined Anthropic, while Ilya Sutskever, OpenAI’s co-founder and a central figure in its AI development, departed to create Safe Superintelligence Inc. This seems to confirm that the original dispute centered on the importance placed on safety. To conclude, recent rumors suggest OpenAI may abandon its nonprofit status and grant shares to Altman, which has triggered another wave of resignations within the company’s leadership and intensified a sense of instability.

From a technical perspective, we saw a significant breakthrough in AI safety from Anthropic. The company achieved a fundamental milestone in LLM interpretability, helping to better understand the “black box” nature of these models. Through their work on the polysemantic nature of neurons and a method for extracting the neural activation patterns that represent concepts, one of the main barriers to controlling Transformer models appears to have been lowered — at least with respect to their potential to deceive us. The ability to deliberately alter these circuits and thereby modify the models’ observable behavior is also promising, and it brought some peace of mind regarding the gap between the capabilities of the models and our understanding of them.
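Anthropic’s interpretability results rely on sparse autoencoders trained on a model’s internal activations to pull apart polysemantic neurons into more interpretable features. As a rough illustration only (not their actual setup, and run here on random toy data rather than real transformer activations), this is a minimal NumPy sketch of the core object: an overcomplete autoencoder with an L1 sparsity penalty on its feature activations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for residual-stream activations; in practice these would be
# recorded from a real transformer layer. Dimensions are illustrative.
d_model, n_features, batch = 64, 256, 512        # overcomplete: 256 > 64
acts = rng.normal(size=(batch, d_model)).astype(np.float32)

# Sparse autoencoder parameters: encoder and decoder.
W_enc = (rng.normal(size=(n_features, d_model)) * 0.1).astype(np.float32)
b_enc = np.zeros(n_features, dtype=np.float32)
W_dec = (rng.normal(size=(d_model, n_features)) * 0.1).astype(np.float32)
b_dec = np.zeros(d_model, dtype=np.float32)

lr, l1_coeff = 1e-3, 1e-3

for step in range(2000):
    # Forward: sparse feature activations, then reconstruction.
    pre = acts @ W_enc.T + b_enc
    feats = np.maximum(pre, 0.0)                 # ReLU -> mostly-zero features
    recon = feats @ W_dec.T + b_dec

    err = recon - acts
    loss = (err ** 2).sum(axis=1).mean() + l1_coeff * np.abs(feats).sum(axis=1).mean()

    # Backward pass (plain SGD), derived by hand for this tiny model.
    d_recon = 2.0 * err / batch
    g_W_dec = d_recon.T @ feats
    g_b_dec = d_recon.sum(axis=0)
    d_feats = d_recon @ W_dec + l1_coeff * np.sign(feats) / batch
    d_pre = d_feats * (pre > 0)
    g_W_enc = d_pre.T @ acts
    g_b_enc = d_pre.sum(axis=0)

    for p, g in ((W_enc, g_W_enc), (b_enc, g_b_enc), (W_dec, g_W_dec), (b_dec, g_b_dec)):
        p -= lr * g

    if step % 500 == 0:
        print(f"step {step:4d}  loss {loss:.4f}  active features {(feats > 0).mean():.2%}")
```

The idea is that each learned feature, unlike a raw neuron, tends to activate for one human-interpretable concept, which is what makes it possible to read (and, in Anthropic’s experiments, steer) what the model is internally representing.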

The problems of the present
Setting aside the future of AI and its potential impacts, let’s focus on the tangible effects of generative AI. Unlike the arrival of the internet or social media, this time society seemed to react quickly, demonstrating concern about the implications and challenges posed by this new technology. Beyond the deep debate on existential risks mentioned earlier — centered on future technological development and the pace of progress — the impacts of existing language models have also been widely discussed. The main issues with generative AI include the fear of amplifying misinformation and digital pollution, significant problems with copyright and private data use, and the impact on productivity and the labor market.

Regarding misinformation, this study suggests that, at least for now, there hasn’t been a significant increase in exposure to misinformation due to generative AI. While this is difficult to confirm definitively, my personal impressions align: although misinformation remains prevalent — and may have even increased in recent years — it hasn’t undergone a significant phase change attributable to the emergence of generative AI. This doesn’t mean misinformation isn’t a critical issue today. The weaker thesis here is that generative AI doesn’t seem to have significantly worsened the problem — at least not yet.

However, we have seen instances of deepfakes, such as recent cases involving AI-generated pornographic material using real people’s faces, and more seriously, cases in schools where minors — particularly young girls — were affected. These cases are extremely serious, and it’s crucial to bolster judicial and law enforcement systems to address them. However, they appear, at least preliminarily, to be manageable and, in the grand scheme, represent relatively minor impacts compared to the speculative nightmare of misinformation fueled by generative AI. Perhaps legal systems will take longer than we would like, but there are signs that institutions may be up to the task, at least as far as deepfakes involving minors are concerned, as illustrated by the exemplary 18-year sentence received by a person in the United Kingdom for creating and distributing this material.

Secondly, concerning the impact on the labor market and productivity — the flip side of the market boom — the debate remains unresolved. It’s unclear how far this technology will go in increasing worker productivity, or in destroying or creating jobs. Online, one can find a wide range of opinions about this technology’s impact. Claims like “AI replaces tasks, not people” or “AI won’t replace you, but a person using AI will” are made with great confidence yet without any supporting evidence — something that ironically recalls the hallucinations of a language model. It’s true that ChatGPT cannot perform complex tasks, and those of us who use it daily know its significant and frustrating limitations. But it’s also true that tasks like drafting professional emails or reviewing large amounts of text for specific information have become much faster. In my experience, productivity in programming and data science has increased significantly with AI-assisted programming environments like Copilot or Cursor. On my team, junior members have gained greater autonomy, and everyone produces code faster than before. That said, this speed in code production could be a double-edged sword, as some studies suggest that code generated with generative AI assistants may be of lower quality than code written by humans without such assistance.

If the impact of current LLMs isn’t entirely clear, this uncertainty is compounded by significant advancements in associated technologies, such as the research line opened by o1 or the desktop control anticipated by Claude 3.5. These developments increase the uncertainty about the capabilities these technologies could achieve in the short term. And while the market is betting heavily on a productivity boom driven by generative AI, many serious voices downplay the potential impact of this technology on the labor market, as noted earlier in the discussion of the financial aspect of the phenomenon. In principle, the most significant limitations of this technology (e.g., hallucinations) have not only remained unresolved but now seem increasingly unlikely to be resolved. Meanwhile, human institutions have proven less agile and revolutionary than the technology itself, cooling the conversation and dampening the enthusiasm of those envisioning a massive and immediate impact.

In any case, the promise of a massive workplace revolution, if it is to materialize, has not done so in these first two years. Considering the accelerated adoption of this technology (according to this study, more than 24% of American workers today use generative AI at least once a week) and assuming that the earliest adopters are perhaps those who benefit from it most, we can suspect that a good part of this technology’s productivity impact has already been realized. In my professional day-to-day and that of my team, the productivity gains so far, while noticeable, significant, and visible, have also been modest.

Another major challenge accompanying the rise of generative AI involves copyright issues. Content creators — including artists, writers, and media companies — have expressed dissatisfaction over their works being used without authorization to train AI models, which they consider a violation of their intellectual property rights. On the flip side, AI companies often argue that using protected material to train models is covered under “fair use” and that the production of these models constitutes legitimate and creative transformation rather than reproduction.

This conflict has resulted in numerous lawsuits, such as Getty Images suing Stability AI for the unauthorized use of images to train models, or lawsuits by artists and authors, like Sarah Silverman, against OpenAI, Meta, and other AI companies. Another notable case involves record companies suing Suno and Udio, alleging copyright infringement for using protected songs to train generative music models.

In this futuristic reinterpretation of the age-old divide between inspiration and plagiarism, courts have yet to decisively tip the scales one way or the other. While some aspects of these lawsuits have been allowed to proceed, others have been dismissed, maintaining an atmosphere of uncertainty. Recent legal filings and corporate strategies — such as Adobe, Google, and OpenAI indemnifying their clients — demonstrate that the issue remains unresolved, and for now, legal disputes continue without a definitive conclusion.

The use of artificial intelligence in the EU will be regulated by the AI Act, the world’s first comprehensive AI law (Photo by Guillaume Périgois on Unsplash).

The regulatory framework for AI has also seen significant progress, with the most notable development on this side of the globe being the European Union’s approval of the AI Act in March 2024. This legislation positioned Europe as the first bloc in the world to adopt a comprehensive regulatory framework for AI, establishing a phased implementation system to ensure compliance, set to begin in February 2025 and proceed gradually.

The AI Act classifies AI risks, prohibiting cases of “unacceptable risk,” such as the use of technology for deception or social scoring. While some provisions were softened during discussions to ensure basic rules applicable to all models and stricter regulations for applications in sensitive contexts, the industry has voiced concerns about the burden this framework represents. Although the AI Act wasn’t a direct consequence of ChatGPT and had been under discussion beforehand, its approval was accelerated by the sudden emergence and impact of generative AI models.

With these tensions, opportunities, and challenges, it’s clear that the impact of generative AI marks the beginning of a new phase of profound transformations across social, economic, and legal spheres, the full extent of which we are only beginning to understand.

I approached this article thinking that the ChatGPT boom had passed and its ripple effects were now subsiding, calming. Reviewing the events of the past two years convinced me otherwise: they’ve been two years of great progress and great speed.

These are times of excitement and expectation — a true springtime for AI — with impressive breakthroughs continuing to emerge and promising research lines waiting to be explored. On the other hand, these are also times of uncertainty. The suspicion of being in a bubble and the expectation of a significant emotional and market correction are more than reasonable. But as with any market correction, the key isn’t predicting if it will happen but knowing exactly when.

What will happen in 2025? Will Nvidia’s stock collapse, or will the company continue its bullish rally, fulfilling the promise of becoming a $50 trillion company within a decade? And what will happen to the AI stock market in general? And what will become of the reasoning model research line initiated by o1? Will it hit a ceiling or start showing progress, just as the GPT line advanced through versions 1, 2, 3, and 4? How much will today’s rudimentary LLM-based agents that control desktops and digital environments improve overall?

We’ll find out sooner rather than later, because that’s where we’re headed.

Happy birthday, ChatGPT! (Photo by Nick Stephenson on Unsplash)


