Large Language Models (LLMs) are not necessarily a breakthrough technology designed to solve problems previously deemed unsolvable. Rather, much like industrial robots automated manual labor on factory floors, LLMs are a step forward in the automation of manual work in the office environment.
I see roughly three different eras in robotic factory automation that have their counterparts in office work automation. In the early days of factory automation, industrial robots were used to automate individual, repetitive, and highly constrained tasks. In the office environment, that corresponds to legacy robotic process automation (RPA), e.g., copying and pasting text, moving files, etc. With the advancements in machine learning and computer vision, industrial robots gained the ability to understand their environment and plan their actions. In the office environment, computer vision and now large language models enable a deeper contextual and semantic understanding of business data. In the future, dexterous mobile robots will replace all manual labor. And in the office, advanced AI agents will orchestrate and execute complex, open-ended tasks that currently require human reasoning.
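To make the first era concrete, legacy RPA of the copy-paste-and-move-files kind is essentially fixed rules over names and paths, with no semantic understanding of the documents involved. A minimal, hypothetical sketch in Python (the folder layout and naming rule are invented for illustration):

```python
# Hypothetical legacy-RPA task: move every file matching a fixed naming
# rule ("invoice_*.pdf") from an inbox folder to an archive folder.
# The "automation" is purely rule-based; the bot never reads the content.
import shutil
from pathlib import Path


def archive_invoices(inbox: Path, archive: Path) -> list[str]:
    """Move files matching a fixed naming rule and return their names."""
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in sorted(inbox.glob("invoice_*.pdf")):
        shutil.move(str(f), str(archive / f.name))
        moved.append(f.name)
    return moved
```

A bot like this breaks the moment a vendor renames their files, which is exactly the brittleness that later, semantics-aware automation addresses.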
Continuing the analogy, the common hurdles in adopting factory automation include integration time and complexity, uptime and throughput, system maintenance, and workplace safety. All these concerns have their counterparts in the domain of office RPA: it can be very complex and time-consuming to integrate RPA with legacy software systems, new automation has to be reliable, maintenance has to be cheap and easy, and security and data privacy are crucial.
Using LLMs inside RPA does not necessarily help with any of these issues. In fact, it makes some of them even more challenging. Current LLMs are still unreliable and inconsistent on more complex tasks, fine-tuning them on a customer's data requires additional software and hardware infrastructure, and when using commercial LLMs, data privacy guarantees depend on the API provider. Moreover, they introduce novel challenges around transparency, ethics, and compliance. However, as mentioned before, they do unlock opportunities for automating more complex tasks, so in many cases the value they bring exceeds the cost of integrating them.
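One common way to contain the unreliability mentioned above is to never trust raw model output: validate it against a schema and retry on failure. A hedged sketch, where `call_llm` is a hypothetical stand-in for whatever commercial API or local model the integration uses, and the invoice fields are invented for illustration:

```python
# Sketch of a validation-and-retry wrapper around an LLM call.
# `call_llm` is a hypothetical callable (prompt -> text); swap in any provider.
import json


def extract_invoice_fields(document_text: str, call_llm, max_retries: int = 3) -> dict:
    """Ask an LLM for JSON fields, accepting only well-formed, complete answers."""
    required = {"vendor", "total", "due_date"}
    prompt = (
        "Extract vendor, total, and due_date from the invoice below "
        "and reply with JSON only.\n\n" + document_text
    )
    for _ in range(max_retries):
        raw = call_llm(prompt)
        try:
            fields = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed output: retry rather than propagate garbage
        if isinstance(fields, dict) and required <= fields.keys():
            return fields  # accept only answers containing every required key
    raise ValueError("LLM did not return valid invoice fields")
```

This kind of guardrail adds latency and cost, which is part of why LLM-based RPA can be harder, not easier, to keep reliable and cheap to maintain.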
With this understanding, a startup venturing into the RPA space could consider two promising avenues:
- They could create an infrastructure tool specifically designed to address and mitigate some of the challenges highlighted earlier, serving other internal teams like DevOps or external RPA providers such as UiPath.
- Alternatively, they could focus on developing a specialized process automation solution that offers improvements over existing market offerings.
When taking the latter approach, the startup could build an unopinionated, general-purpose, no-code RPA platform. This would let the customer's internal teams handle the actual task automation, giving them flexibility and control. Alternatively, the startup could craft an end-to-end automation solution tailored to a specific task, offering a more streamlined, ready-to-use product.
It’s important to note the potential for future growth and expansion irrespective of the initially chosen path. Startups can evolve by including elements of the alternate approach, allowing them to better adapt to changing market demands and customer needs.
Large language models, such as GPT-4 and Llama 2, stand on the precipice of a revolution, poised to reimagine office work in the same profound way modern industrial robots have transformed factory operations. However, harnessing their full potential is not without its hurdles, involving both traditional and emergent challenges. Yet in every challenge lies an opportunity, a silver lining that innovative companies can leverage to carve out their niche in this burgeoning market.