Unveiling the AI of Tomorrow: 5 Bold Predictions for OpenAI Developers Day

From GPT-4V to Universal Basic Income: What You Need to Know About the Future That's Already Here

Brent Moreno
October 15, 2023

OpenAI Dev Day: 5 Bold Predictions Based On Research

As November 6th rapidly approaches, anticipation is building for OpenAI's Dev Day, an event that promises to be a significant milestone in the AI landscape. While the company has been notably reserved about what will be announced, the recent rollout of features like GPT-4V and DALL-E 3 to ChatGPT Plus and Enterprise subscribers has only intensified the intrigue. What more could OpenAI have up its sleeve for this one-day event that aims to bring hundreds of developers from around the globe into a dialogue with OpenAI's technical team?

Since launching its API in 2020, OpenAI has been relentless in updating it with its most advanced models, such as GPT-4, GPT-3.5, DALL·E, and Whisper. These innovations have empowered over 2 million developers to incorporate cutting-edge AI into a myriad of applications, from smart assistants to entirely new services that were previously unimaginable. This mass adoption stands as a testament to OpenAI's groundbreaking progress and fuels speculation about what the next big leap could be. Could we be on the cusp of transforming AI from a mere tool to an intellectual collaborator?

As someone who keeps a finger on the pulse of all things AI, from mainstream research to grassroots open-source projects, I have some thoughts and predictions to share. We'll delve into both the tantalizing possibilities and the formidable challenges that could shape the AI landscape for 2024 and beyond. Whether you're a developer, an investor, or simply an AI enthusiast, this article aims to provide you with a comprehensive look at the technological advancements, ethical implications, policy considerations, and commercial opportunities that could redefine our future.

So, as we count down the days to what could be a pivotal moment in the AI timeline, let's dive into the predictions and explore the future of artificial intelligence.

Prediction #1 - GPT-4V (Vision)

Everyone assumed OpenAI would hold back on rolling out GPT-4V until Dev Day. It seemed like the perfect time to introduce the model’s new multimodal capabilities. Instead, every Plus and Enterprise subscriber has gained access in the last two weeks, and if you haven’t realized it yet, it’s a very big deal. I’m using it almost every day for various tasks, mostly having fun pairing it with the new DALL-E 3 access inside ChatGPT to recreate artwork I’m fond of.

One of these images is a real photo I really liked, so I recreated it using GPT-4V and DALL-E 3.

Just snap a photo of artwork you like, have GPT-4V describe it and write a prompt for DALL-E 3, then copy that prompt into DALL-E 3 to recreate virtually anything in any style using plain English. I’m having a lot of fun with this and have a ton of creative ideas for ways to use it.

As of right now, there has been no mention of when the GPT-4V API will become available to developers. Once it does, you can expect a whole slew of applications and use cases to quickly hit the market. It’s the API wrapper craze, round 2. I, for one, can’t wait to see what creative things people come up with. I have some ideas based on how I’ve been using it, but I’m sure others will blow them out of the water. For this reason, I believe OpenAI will demo GPT-4V, share what it has learned from its users and developers, and possibly show off some cool new applications already built on top of the platform. All this, and hopefully an announcement that the API will be available to developers sooner rather than later.

Prediction #2 - Upgrades to the UI/UX

There is nothing wrong with the current UI/UX but there are several Chrome extensions I love that make ChatGPT even more useful that OpenAI could build upon. The best two I’ve seen are Superpower ChatGPT and AIPRM. Both are beneficial for two very different reasons. AIPRM for instance is a great resource for plug-and-play prompts that are tested and voted on by a large community. When I first heard about it and gained access, I almost didn’t want to tell anyone about it. I gatekept it for about 5 days before I realized it was just too powerful not to share.

Fast forward, and now there is Superpower ChatGPT, another free Chrome extension that layers a ton of useful features on top of the ChatGPT UI. This Chrome extension is nothing short of a revolutionary toolkit designed to maximize your experience with OpenAI's ChatGPT. Let's delve into its extensive list of features and benefits:

Chat Management:

Folders and Reordering: Organize your chats into colored folders, easily reordering them as needed.
Auto and Quick Sync: Safeguard your conversations with automatic syncing options.
Export and Search: Export your chats in various formats and use the robust search functionality for quick navigation.

Prompt Management:

Prompt Chains and Auto Complete: Save a sequence of prompts and let the extension execute them with a single click.
Prompt History and Quick Access: Your entire prompt history is at your fingertips, accessible through simple keyboard shortcuts.

Language and Style:

Multi-Language Support: Switch between over 190 languages effortlessly.
Tone Customization: Personalize the tone and style of ChatGPT's responses with a click.

Utilities:

Custom Instruction Profiles and Auto Splitter: Save multiple instruction sets for various tasks and automatically split your long inputs for more efficient processing.
Model Switcher: Seamlessly switch between GPT models mid-conversation.

Not only does this extension offer a comprehensive suite of utilities to enhance your ChatGPT experience, but it also provides unprecedented control over the language, tone, and style of your interactions. From advanced chat management to on-the-fly model switching, this extension is the must-have companion for anyone looking to harness the full power of ChatGPT. Experience a smarter, more organized, and highly customizable AI conversation like never before!

These features seem like easy additions that OpenAI could roll out in the coming months and I wouldn’t be the least bit surprised if they announce something along these lines on Dev Day.

Prediction #3 - A Game-Changing Increase in Token Limits

The Current State of Token Limits

Token limitations have been a significant constraint in harnessing the full potential of language models like ChatGPT. Whether you're using the API or the interactive interface, you've likely run into the issue of having to truncate, split, or otherwise manipulate text to fit within these token limits.
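To make the "split" workaround concrete, here is a minimal sketch of chunking a long text so each piece fits under a limit. It is only an illustration: the function name split_into_chunks is hypothetical, and it counts whitespace-separated words as a rough stand-in for tokens, whereas a real model tokenizer will produce different counts.

```python
def split_into_chunks(text: str, max_tokens: int) -> list[str]:
    """Split text into pieces of at most max_tokens words.

    Assumption: one whitespace-delimited word is treated as one token.
    Real tokenizers split text differently, so leave headroom in practice.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    # Step through the word list in windows of max_tokens words.
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


chunks = split_into_chunks("one two three four five six seven", max_tokens=3)
print(chunks)  # ['one two three', 'four five six', 'seven']
```

Each chunk would then be sent as its own request, which is exactly the kind of manual bookkeeping a larger token limit would make unnecessary.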

The Claude 2 Exception

But not all models have these restrictions. Take Claude 2, for instance, which boasts a remarkable 100k token limit. It's become my go-to option when dealing with large volumes of text that I'd rather not convert to PDF format. The high token limit makes Claude 2 uniquely suited for extensive text processing tasks.

Why a Token Increase is Likely

Recent advancements suggest that a significant increase in token limits is not only possible but imminent. Microsoft Research has been leading the charge in this area. A few months ago, they released a paper showcasing their ability to handle up to a million tokens in a single go.

But what's even more intriguing is their newly introduced LongNet model. LongNet can manage sequences of over 1 billion tokens without sacrificing performance on shorter sequences. It utilizes an adapted attention mechanism called "dilated attention," which allows the model to scale linearly and efficiently process web-sized datasets. This breakthrough technology has proven its mettle by outperforming traditional models in tests.
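For a rough feel of why dilated attention scales linearly, the sketch below counts how many query-key pairs get compared when a sequence is cut into fixed-size segments and only every r-th position inside each segment is kept. This is a simplification under stated assumptions, not the paper's implementation: it uses a single segment length and dilation rate (the values 8 and 2 are arbitrary), while LongNet mixes several, and the function names are hypothetical.

```python
def dilated_indices(n: int, segment: int, dilation: int) -> list[list[int]]:
    """For each segment of the sequence, keep every `dilation`-th position."""
    return [
        list(range(start, min(start + segment, n), dilation))
        for start in range(0, n, segment)
    ]


def attention_pairs(n: int, segment: int, dilation: int) -> int:
    """Count query-key pairs when attention runs only inside each sparsified segment."""
    return sum(len(idx) ** 2 for idx in dilated_indices(n, segment, dilation))


# Compare the sparse pair count with dense attention (n * n) as n doubles.
for n in (16, 32, 64):
    print(n, attention_pairs(n, segment=8, dilation=2), n * n)
# 16 32 256
# 32 64 1024
# 64 128 4096
```

Doubling the sequence length doubles the sparse count but quadruples the dense one, which is the gap that makes very long sequences tractable.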

What This Could Mean for OpenAI

Given these advancements, it’s entirely plausible that OpenAI could announce a dramatic increase in token limits for ChatGPT and their API. While reaching the billion-token capability of LongNet might be a stretch, an increase to 100k tokens—or even up to a million—is within the realm of possibility.

So, could OpenAI swing for the fences and outperform or at least match Claude 2's token limits? All signs point to yes. While a billion tokens may be overly optimistic, a range between 100k and a million tokens seems likely, and it would be a game-changer for developers and data scientists alike.

Prediction #4 - Autonomous Agents are Coming!

Let's cut right to the chase. The AI landscape is shifting, and it's shifting fast. Gone are the days when virtual assistants like Siri and Alexa were the epitome of AI innovation. Enter autonomous agents—AI's new game-changers. These aren't your run-of-the-mill, command-based virtual paperweights; they're fully empowered AI entities capable of understanding context, making decisions, and executing complex tasks autonomously. The transformation they bring to the table is both seismic and disruptive, promising to reshape how we interact with technology both personally and professionally. With heavy-hitting investments and a technological arms race underway, autonomous agents are not just the future—they're the present. Let’s dive into the current autonomous agents.

The Autonomous Agents Landscape: Who's Who

AgentGPT

A platform that allows users to configure and deploy autonomous AI agents. These agents are designed to gather information and perform tasks on their own to achieve user-defined goals.

Baby AGI

An AI-powered task management system that uses OpenAI and Pinecone APIs to perform tasks autonomously based on the outcomes of previous tasks while maintaining a defined purpose.

Auto-GPT

This agent aims to achieve a user-defined goal in natural language, breaking it down into sub-tasks and utilizing the internet and other tools in an automated loop. It uses OpenAI’s GPT-4 or GPT-3.5 APIs.

Agent-LLM

An AI Automation Platform that offers efficient AI instruction management across several vendors. It features adaptive memory and a robust plugin system, enabling a wide variety of commands.

JARVIS / HuggingGPT

A collaborative system that uses a Large Language Model (LLM) as the central controller and expert models as executors. This agent can utilize both LLMs and other specialized models.

Xircuits

A toolbox for experimenting with and building Collaborative Large Language Model-based agents. It's highly customizable and comes with BabyAGI agents by default.

ChaosGPT

A controversial agent with an aim to "destroy humanity," its capabilities are limited by the absence of access to destructive tools. However, it serves as an intriguing case study on the ethical considerations surrounding autonomous agents.

Micro-GPT

A lightweight agent compatible with GPT-3.5-Turbo and GPT-4. It features strong urging, a limited tool set, and short-term memory.

AutoGPT.js

An open-source project aimed at bringing AutoGPT’s capabilities to your browser. It operates directly in the browser, providing increased accessibility and privacy.

SFighterAI

Focused on gaming, this AI agent is trained using deep reinforcement learning to defeat the final boss in “Street Fighter II: Special Champion Edition” based solely on the RGB pixel values of the game screen.

ChatDev

ChatDev is one of my personal favorites and the one I have the most experience using. It’s a platform that lets users create customized software from natural language ideas through LLM-powered multi-agent collaboration. Because it is built on large language models (LLMs), it also serves as an ideal scenario for studying collective intelligence.

ChatDev has several features, including:

- Customization: ChatDev is highly customizable and extendable.

- Multi-Agent Collaboration: ChatDev's agents form a multi-agent organizational structure and collaborate by participating in specialized functional seminars, including tasks such as designing, coding, testing, and documenting.

- No-Code Creation Tool: ChatDev has a revolutionary no-code creation tool that enables users to design virtual characters and develop games without any coding expertise.

- AI NPC Gaming Research Platform: ChatDev is a cutting-edge AI NPC gaming research platform that allows for intricate manipulation of NPC interactions within meticulously crafted simulated social settings.

ChatDev can be used to create powerful software in minutes with AI agents. It can also be used to build an AI agent workforce for software development. To get started with ChatDev, users can clone the GitHub repository and set up a Python environment.

Here’s a video of my first time using ChatDev

Each of these agents offers a unique set of capabilities and potential applications, setting the stage for a future where AI doesn't just assist us but collaboratively works with us to achieve complex goals. But here is where things get interesting and why I believe there will be an autonomous agent announcement at the upcoming Dev Day.
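The pattern that agents like Auto-GPT and BabyAGI share (break a goal into sub-tasks, execute them, and queue new tasks based on the results) can be sketched in a few lines. This is a toy illustration, not any project's actual code: run_agent and the three helper functions are hypothetical names, and the helpers return canned strings where a real agent would call an LLM or a tool.

```python
from collections import deque
from typing import Callable


def run_agent(
    goal: str,
    plan: Callable[[str], list[str]],
    execute: Callable[[str], str],
    follow_up: Callable[[str, str], list[str]],
    max_steps: int = 10,
) -> list[tuple[str, str]]:
    """Work through a task queue, letting each result spawn new tasks."""
    queue = deque(plan(goal))
    log: list[tuple[str, str]] = []
    # max_steps is a safety cap so the loop cannot run forever.
    while queue and len(log) < max_steps:
        task = queue.popleft()
        result = execute(task)
        log.append((task, result))
        queue.extend(follow_up(task, result))
    return log


# Hypothetical stand-ins for LLM and tool calls.
def toy_plan(goal: str) -> list[str]:
    return [f"research {goal}", f"summarize {goal}"]


def toy_execute(task: str) -> str:
    return f"done: {task}"


def toy_follow_up(task: str, result: str) -> list[str]:
    # Only research tasks spawn a follow-up check.
    return [f"verify {task}"] if task.startswith("research") else []


for task, result in run_agent("token limits", toy_plan, toy_execute, toy_follow_up):
    print(task, "->", result)
```

The step cap matters in practice: without it, an agent whose results keep generating new tasks would loop, and burn API credits, indefinitely.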

AutoGen: The Multi-Agent Maestro

AutoGen is a groundbreaking framework developed by Microsoft in collaboration with Penn State University and the University of Washington, designed to catalyze the development of Large Language Model (LLM) applications. Its architecture enables multi-agent conversations where agents can interact with one another and even integrate human inputs to solve complex tasks. Here are some of its standout features:

- Ease of Development: AutoGen simplifies the often intricate orchestration, automation, and optimization of LLM workflows.

- Customizable and Conversable Agents: AutoGen offers high flexibility in designing conversation patterns, allowing for different levels of autonomy, numbers of agents involved, and agent conversation topology.

- Diverse Applications: AutoGen showcases a variety of working systems, demonstrating its adaptability across domains and complexities.

- Enhanced API Capabilities: It serves as an advanced inference API, providing functionalities like API unification, caching, error handling, and multi-config inference.

This multi-agent framework can transform the way businesses and individuals interact with AI, making it more collaborative, efficient, and versatile than ever before.
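To picture what a "multi-agent conversation" means, here is a bare-bones sketch in plain Python. It deliberately does not use AutoGen's own API: the Agent class and converse function are hypothetical, and each agent's reply is a simple function standing in for an LLM call or a human typing a response.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    name: str
    reply: Callable[[str], str]  # stand-in for an LLM call or human input


def converse(first: Agent, second: Agent, opening: str, turns: int) -> list[tuple[str, str]]:
    """Let two agents alternate replies, starting from first's opening message."""
    transcript = [(first.name, opening)]
    speaker, listener = second, first
    message = opening
    for _ in range(turns):
        message = speaker.reply(message)
        transcript.append((speaker.name, message))
        # Swap roles so the other agent answers next.
        speaker, listener = listener, speaker
    return transcript


coder = Agent("coder", lambda m: f"code for: {m}")
reviewer = Agent("reviewer", lambda m: f"review of: {m}")

for name, message in converse(reviewer, coder, "add two numbers", turns=2):
    print(f"{name}: {message}")
# reviewer: add two numbers
# coder: code for: add two numbers
# reviewer: review of: code for: add two numbers
```

Swapping one agent's reply function for one that asks a person for input is all it would take to put a human in the loop, which is the flexibility the feature list above describes.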

Why AutoGen Could Be a Game-Changer for ChatGPT

Given that AutoGen is a result of collaborative research involving Microsoft, a significant backer of OpenAI, and OpenAI's own affirmative stance on autonomous agents, there is compelling reason to believe that AutoGen-like features could be integrated into ChatGPT in the near future. OpenAI executives have taken to Twitter to assert the transformative potential of autonomous agents, reinforcing the likelihood of such an integration.

Is This The Dawn of AGI?

When you combine the autonomous capabilities of AutoGen with the multi-modal functionalities found in research like VideoDirectorGPT, you're not just looking at smarter AI—you're glimpsing the horizon of Artificial General Intelligence (AGI). Imagine a future where you could command an entire team of autonomous agents, each with its unique set of custom instructions. From generating working code and editing videos to even writing and directing AI-generated movies, the possibilities are boundless. ChatDev has already revolutionized the programming world, allowing anyone to generate functional code with a single line of prompting. Now, extrapolate that to multiple domains, and the AGI future doesn't seem that far off.

In essence, we're looking at a future where AI doesn't merely assist but collaboratively works alongside us, augmenting our abilities and even taking on roles we could never have imagined. It's a future where AGI isn't just a distant dream but a tangible possibility.

Prediction #5 - GPT-5

OpenAI has been tight-lipped about the development of GPT-5, even outright dismissive of the idea that it’s working on a larger, more capable model. But we all have that tingling feeling in our stomachs that Dev Day just might be the day they peel back the curtain a bit and put an end to all of the hoopla around Google Gemini, which is being billed as the ChatGPT killer. So far, IMO, Google has continued to fall short of its promises and often rushes its products to the point that they rarely work well out of the gate. I guess time will tell. If OpenAI isn’t working on larger models, which I find hard to believe now that GPT-4 is multimodal, then are autonomous agents what separate GPT-4V and GPT-5? Who knows, but I truly believe we will all get a much better picture come November 6th.

With the dawn of autonomous agents, and with both larger models and smaller, more capable ones rolling out every day, it’s not hard to see that what most people would call AGI is not very far from being a reality. Because of this, I truly believe we will be comfortable calling GPT-5 or GPT-6 an acceptable answer to what AGI is and can do. The question is, are we ready as a society for what this all truly means in the next 2-3 years? If not the next 2-3, then surely within the next 5. I pay very close attention to research and news, and things are advancing faster than even the top researchers can keep up with. I feel confident in my assessments and predictions, no matter how outlandish they may seem now.

The real question is what our society looks like when Humans Need Not Apply. We’ve heard it all before: each technological revolution displaces a lot of jobs but creates more jobs. For at least the next 5 years, I think most people are safe, especially if they start reskilling, upskilling, and learning how to use generative AI tools for work. If you aren’t already, you’re pretty crazy not to. However, outside of a few select positions and blue-collar skill sets that are safe only until robotics catches up (which it is doing, btw), no one can point to why or how most of us will be useful. If AGI is self-improving, writes and corrects its own code, learns from its mistakes, and constantly evolves itself to become more efficient, what’s really left for us to do? The answer is not much. This is where a massive psychological and societal shift must happen, one in which our identities are no longer tied to our careers and work but to our thoughts and expressions and who we choose to be as humans, as people, in this new and advanced world.

So the question remains...

Are We Truly Ready for AGI? Can We Be?

If we're honest, society is still grappling with the current wave of narrow AI. Widespread acceptance and understanding of AGI's implications are still embryonic at best. While the tech-savvy are eager for the next iteration of AI, many are still coming to terms with data privacy issues, job displacement, and the ethical dilemmas AI presents.

The Universal Basic Income (UBI) Conundrum

With AGI potentially automating a significant chunk of the workforce, the concept of UBI isn't just a philosophical debate anymore—it's an impending necessity. The question isn't if but when and how we'll implement it. UBI could serve as a buffer, giving people the financial freedom to upskill, reskill, or even pursue more creative, humanitarian, or intellectual endeavors. However, the challenge lies in its funding, governance, and equitable distribution.

Building a Utopian Society

In a world where AGI can cater to our every need, the concept of a utopian society isn't far-fetched. However, it requires a seismic shift in societal values, structures, and norms. A world of abundance could either lead to collective elevation or exacerbate existing inequalities. The choice is ours to make, and it starts with responsible AI development and deployment. Some say I’m a dreamer, but I’m not the only one.

Current Challenges

1. Data Privacy: The more capable AI becomes, the more data it needs, potentially jeopardizing user privacy.

2. Job Displacement: Automation is already a concern, and AGI could accelerate this trend.

3. Ethical Dilemmas: From biased algorithms to AI in warfare, the ethical questions are mounting.

4. Regulatory Framework: A clear policy on AI usage, safety, and ethical considerations is still lacking.

Future Challenges

1. Human-AI Interaction: As AGI approaches, the line between AI and human intelligence blurs, leading to unique psychological and social challenges.

2. Security Risks: More capable AI can also mean more capable cyber-attacks.

3. AI Governance: Who controls AGI, and how do we ensure it aligns with human values?

Ideas & Solutions

1. AI Education: A public awareness and education campaign on AGI's impacts.

2. Multi-Stakeholder Governance: Include experts from various fields in AI policy-making.

3. AI Ethics Committees: Establish independent bodies to review AI development and implementation.

My Goal: A Seat at the Table

As someone deeply embedded in both the practical and theoretical aspects of AI, my goal is to be a part of the policy and regulation conversation. It's crucial to bridge the gap between what's happening in the tech labs and the halls of governance. I aim to bring a nuanced, multi-disciplinary perspective to the table, ensuring that as we advance towards AGI, we do so responsibly, ethically, and inclusively.

About Fraction AI Consulting, Aideations, and the Author

Fraction AI Consulting is a leader in AI training and consulting, specializing in equipping teams with the skills to utilize AI for long-term productivity and profitability. Our newsletter arm, Aideations, serves as a critical platform for thought leadership in the AI space, keeping readers informed about the latest advancements and ethical considerations.

Brent Moreno, the Founder and CEO, stands at the intersection of experiential & brand marketing and generative AI. With over a decade of experience in marketing and a prolific portfolio of over 150 articles on generative AI, Brent is a sought-after authority in both fields. His unique "over-the-shoulder" approach to consulting merges hands-on services with live, on-site training, providing transformative solutions and invaluable education.

Conclusion

The road to AGI is fraught with challenges and opportunities. While the tech advancements are exhilarating, they also bring forth existential questions that we can't afford to ignore. The future is a blank canvas, and AGI gives us an unprecedented palette of colors. But what we paint is up to us. Let's make it a masterpiece.

This wraps up my analysis, predictions, and insights into the rapidly evolving world of AI and AGI. The clock is ticking, and as we approach pivotal moments like OpenAI's Dev Day, one thing is clear: the future is now, and it's ours to shape.

Your Support Means Everything

If you've gotten even a sliver of value from this special edition of Hittin' Dingers by Aideations, I have a small but significant favor to ask: could you take just two minutes to share it with your friends, family, coworkers, and even your bosses? Your personal recommendation is the highest form of validation for the work that goes into putting this together, and it would mean the world to me.

Your sharing helps not just in growing this newsletter but also in expanding the community around Aideations and Fraction AI Consulting. It's much more valuable to me than asking you for your money. So, if you think what we're doing here matters, and you'd like to see more of it, please do take a moment to hit that share button.

Thank you for your continued support.