Welcome to the February 2023 edition of Baseline, Accenture Federal Services’ machine learning newsletter. In Baseline, we share insights on important advances in machine learning technologies likely to impact our federal clients.  

This month we cover the following topics:  

  • OpenAI’s new embedding model improves performance and drives down cost 
  • Robotics Transformer 1 brings transformer models to robotics 
  • Claude addresses harmful outputs with constitutional AI 

Click here to subscribe to email updates: Receive Baseline every month in your inbox and stay up to date on the latest advancements in machine learning.


OpenAI’s new embedding model improves performance and drives down cost

Embedding models are used in natural language processing to transform text into numerical representations that support downstream machine learning tasks such as summarization, search, classification, and clustering. OpenAI’s collection of state-of-the-art embedding models, based on the GPT-3 language model, has been available as API endpoints. In December 2022, OpenAI released a new embedding model, text-embedding-ada-002, which replaces five separate embedding models that were previously tuned for specific tasks such as text search, text similarity, and code search.


This more unified model will simplify implementation for ML engineers, who can reuse the same model for many tasks.


Additionally, the new model outperforms Davinci, previously the largest and highest-performing embedding model in OpenAI’s suite, while costing 99.8% less. Driving down costs while increasing performance and usability means that embedding models will have more production value for natural language processing tasks.
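As a sketch of how the unified endpoint might be used, the example below requests ada-002 embeddings and compares two texts with cosine similarity. It assumes the OpenAI Python client as it existed at release (the pre-1.0 `openai.Embedding.create` interface); the `embed` helper and similarity function are illustrative, not part of the API.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def embed(texts):
    """Request ada-002 embeddings for a batch of texts (requires an API key)."""
    import openai  # pip install openai (pre-1.0 interface assumed here)
    response = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return [item["embedding"] for item in response["data"]]

if __name__ == "__main__":
    docs = ["How do I reset my password?", "Steps to change a forgotten password"]
    vec_a, vec_b = embed(docs)
    print(f"similarity: {cosine_similarity(vec_a, vec_b):.3f}")
```

Because one model now handles search, similarity, and clustering alike, the same embeddings can be reused across all of those tasks rather than requesting task-specific vectors.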

Robotics Transformer 1 brings transformer models to robotics

Experts attribute recent advancements across machine learning subfields to the transfer of knowledge from large, diverse datasets and to expressive models that can effectively process large amounts of data. This capability has been demonstrated in several machine learning domains, such as computer vision, natural language processing, and speech recognition. However, this approach had not worked well in robotics, due to a lack of diverse and extensive robotics data, which limited the ability of models to learn from a wide range of experiences. An additional limitation was the lack of scalable, real-time inference models that can effectively generalize learning from such large datasets.

To address these challenges, Google AI researchers found that combining open-ended, task-agnostic training with high-capacity architecture that can process a wide range of robotic data is useful for the development of general robotic models.


As a result, they developed the Robotics Transformer 1 (RT-1), a multi-task model that enables real-time inference. The model was trained on a real-world robotics dataset of 130,000 short clips of robot manipulations collected over 17 months from a fleet of 13 Everyday Robots (EDR), covering more than 700 tasks.
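One detail of RT-1’s design is that it outputs actions as discrete tokens, with each continuous action dimension mapped into 256 uniform bins. The sketch below illustrates that style of uniform binning; the value ranges, helper names, and the example gripper dimension are illustrative assumptions, not taken from the RT-1 code.

```python
import numpy as np

NUM_BINS = 256  # RT-1 discretizes each action dimension into 256 bins

def discretize(value, low, high, num_bins=NUM_BINS):
    """Map a continuous action value in [low, high] to a bin index."""
    value = np.clip(value, low, high)
    frac = (value - low) / (high - low)
    return int(min(frac * num_bins, num_bins - 1))

def undiscretize(index, low, high, num_bins=NUM_BINS):
    """Map a bin index back to the center of its bin."""
    return low + (index + 0.5) * (high - low) / num_bins

# Illustrative example: an arm position coordinate with an assumed range of [-1.0, 1.0]
token = discretize(0.25, -1.0, 1.0)
recovered = undiscretize(token, -1.0, 1.0)
```

Treating actions as tokens is what lets a standard transformer decoder predict them the same way a language model predicts words, which is central to the "transformer models for robotics" framing.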

Google AI researchers evaluated the generalization capabilities and performance of the RT-1 model against three baseline models in four categories: seen tasks, unseen tasks, tasks with the addition of distractors and varied backgrounds (robustness), and scenarios requiring a sequence of discrete actions (long-horizon tasks). Across all four categories, RT-1 drastically outperforms previously published imitation-learning-based baseline models.

RT-1 is an efficient and scalable model for generating actions in real-world robotics tasks. The RT-1 code is open source and publicly available, making it a useful tool for scaling robot learning in future research, and its zero-shot learning capability makes it particularly intriguing as a step toward robust robot performance.


Video: A demonstration of RT-1 controlling a robot and performing tasks in multiple real kitchens.


Claude addresses harmful outputs with constitutional AI

ChatGPT made waves when it was released in November 2022 due to its impressive ability to respond to a large breadth of user queries in realistic natural language. While groundbreaking and impressive, many limitations of ChatGPT have been documented, such as indications of bias and inaccurate outputs. Anthropic has built a new model, Claude, which aims to address some of the limitations of ChatGPT, specifically harmful outputs.

Claude is only available as a closed demo right now, but some of the guiding principles of the model have been released. Claude is trained with a technique called “constitutional AI,” in which a set of guiding principles is used to train a feedback model prior to training Claude itself. This is in contrast with ChatGPT, which uses reinforcement learning from human feedback (RLHF).

In RLHF, human users rank the quality of model outputs, and those rankings are used to train a reward model prior to training the final model. In constitutional AI, by contrast, an AI model scores and revises the initial model’s outputs based on how well they align with the guiding principles (the “constitution”).
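The critique-and-revise step can be sketched as a simple loop. Everything below is a toy illustration: the `critique` and `revise` functions stand in for calls to a feedback model, and the principle text is invented for the example, since Anthropic has not released the actual constitution.

```python
# Toy sketch of a constitutional-AI revision pass. The critique and revise
# functions are placeholder stand-ins for feedback-model calls; the
# principles listed here are illustrative, not Anthropic's.
PRINCIPLES = [
    "Do not provide instructions for harmful activities.",
    "Prefer a helpful answer over an evasive one.",
]

def critique(response, principle):
    """Placeholder: return a criticism if the response violates the principle."""
    if principle.startswith("Prefer") and response.strip() == "I don't know.":
        return "The response is evasive and unhelpful."
    return None  # no violation found

def revise(response, criticism):
    """Placeholder: rewrite the response to address the criticism."""
    return "Here is a safe, informative answer instead of an evasive one."

def constitutional_pass(response):
    """Run one critique-and-revise pass over every principle."""
    for principle in PRINCIPLES:
        criticism = critique(response, principle)
        if criticism is not None:
            response = revise(response, criticism)
    return response
```

Because the scoring and revision are done by a model rather than by human raters, this loop can run unattended over large batches of training outputs, which is the scalability advantage discussed below.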

The constitution principles have not been released, but Anthropic has discussed how constitutional AI aims to minimize harm while maximizing usefulness. (For example, an evasive answer such as “I don’t know” minimizes harm but is useless.) Compared to RLHF, one benefit of this approach is that human feedback is not needed once training begins, maximizing the scalability of the supervised training process. New methods and techniques for minimizing harmful outputs in conversational large language models will be essential to the widespread adoption of these models.


Accenture Federal Services is a leader in artificial intelligence for the U.S. federal government. Our Machine Learning Center of Excellence, Applied Intelligence Discovery Lab, and Advanced Research Group continually assess, develop, and adapt the world’s most innovative techniques and emerging technologies for mission-critical applications.

Shauna Revay, Ph.D.

Manager – Accenture Federal Services, Machine Learning

Shean Scott Jr.

Specialist – Accenture Federal Services, Machine Learning

Subscription Center
Subscribe to Accenture's Federal Viewpoints Blog