🐍⌨️ Merging OpenAI Function Calls with Pydantic

OpenAI recently introduced function calls, which are, in essence, a tool for parsing structured data. Although the feature is presented as a way to interact with APIs, its real value lies in its ability to produce well-structured output. So how can we best harness this capability? Enter Pydantic, a Python library that offers an expressive, lightweight way to define and validate data structures.

If you just want to look at the code, you can check it out here.

The Dynamic Duo: OpenAI Function Calls and Pydantic

OpenAI's function calls mark a significant shift in how we interact with APIs. Instead of guessing how to parse a free-text response, we can now describe a function and the inputs it expects, and ask the model to reply with arguments for it. Combined with Pydantic's data validation, this lets us structure our conversations with the AI in a robust and efficient manner.
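As a rough sketch of what "specifying a function and its expected inputs" means in practice, the snippet below uses only the standard library. The `search` function description and the reply message are hand-written stand-ins, not real API output; the shape assumed here is a function described with JSON Schema and a reply whose `function_call.arguments` arrives as a JSON string.

```python
import json

# Hypothetical function description; "parameters" is plain JSON Schema.
search_function = {
    "name": "search",
    "description": "Search query for a single request",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Query to search for"},
            "type": {"type": "string", "enum": ["video", "email"]},
        },
        "required": ["query", "type"],
    },
}

# Hand-written stand-in for the assistant message returned by the API.
# Note that "arguments" is a JSON string, not a dict.
message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "search",
        "arguments": '{"query": "investment case study", "type": "video"}',
    },
}

# Decode the arguments and check that every required key is present.
arguments = json.loads(message["function_call"]["arguments"])
required = search_function["parameters"]["required"]
missing = [key for key in required if key not in arguments]

print(arguments["query"], arguments["type"], missing)
```

Checking required keys by hand like this is exactly the kind of bookkeeping that a validation library can take over.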

But what about Pydantic? What does it bring to the table?

Pydantic leverages Python type annotations for data validation, ensuring that your data adheres to the correct types, constraints, and formats. It helps in generating JSON schemas, applying additional validation rules, handling errors, and more, all within Python's familiar landscape.
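Here is a small sketch of those features, using a pared-down search model chosen for illustration. It assumes either Pydantic v1 or v2 is installed; the `hasattr` check only exists to pick whichever schema method that version provides.

```python
from enum import Enum

from pydantic import BaseModel, Field, ValidationError


class SearchType(str, Enum):
    VIDEO = "video"
    EMAIL = "email"


class Search(BaseModel):
    query: str = Field(..., description="Query to search for relevant content")
    type: SearchType = Field(..., description="Type of search")


# Pydantic v2 exposes model_json_schema(); v1 exposes schema().
if hasattr(Search, "model_json_schema"):
    schema = Search.model_json_schema()
else:
    schema = Search.schema()
print(sorted(schema["properties"]))

# Valid input is converted into typed attributes.
ok = Search(query="single sign on", type="video")
print(ok.type is SearchType.VIDEO)

# Invalid input raises a ValidationError that names the offending field.
bad_fields = []
try:
    Search(query="single sign on", type="podcast")
except ValidationError as exc:
    bad_fields = [error["loc"][0] for error in exc.errors()]
print(bad_fields)
```

The type annotations do double duty: they drive the generated JSON schema and the validation of incoming data.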

Simplifying OpenAI Function Calls with Pydantic: A Practical Approach

To illustrate this concept, let's consider a complex use-case: splitting a request into multiple search queries.

from openai_function_call import OpenAISchema
from pydantic import Field
from typing import List
from tenacity import retry, stop_after_attempt
import openai
import enum


class SearchType(str, enum.Enum):
    VIDEO = "video"
    EMAIL = "email"


class Search(OpenAISchema):
    """
    Search query for a single request

    Tips:
    - Be specific with your query, use key words and multiple representations of the same thing, e.g. "video" and "video clip" or "SSO" and "single sign on"
    - Use the title to describe the request, e.g. "Video from last week about the investment case study"
    """

    title: str = Field(..., description="Title of the request")
    query: str = Field(..., description="Query to search for relevant content")
    type: SearchType = Field(..., description="Type of search")

    async def execute(self):
        import asyncio

        await asyncio.sleep(1)
        print(
            f"Searching for `{self.title}` with query `{self.query}` using `{self.type}`"
        )


class MultiSearch(OpenAISchema):
    """
    Segment a request into multiple search queries

    Tips:
    - Do not overlap queries, e.g. "video" and "video clip" are too similar
    """

    searches: List[Search] = Field(..., description="List of searches")

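    def execute(self):
        import asyncio

        async def run_all():
            # Run every search concurrently and collect the results in order
            return await asyncio.gather(
                *(search.execute() for search in self.searches)
            )

        # asyncio.run creates the event loop, runs the coroutine, and closes the loop
        return asyncio.run(run_all())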


@retry(stop=stop_after_attempt(3))
def segment(data: str) -> MultiSearch:
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        temperature=0,
        functions=[MultiSearch.openai_schema],
        function_call={"name": MultiSearch.openai_schema['name']},
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant.",
            },
            {
                "role": "user",
                "content": f"Consider the data below:\n{data} and segment it into multiple search queries",
            },
        ],
        max_tokens=1000,
    )
    return MultiSearch.from_response(completion)


if __name__ == "__main__":
    queries = segment(
        "Please send me the video from last week about the investment case study and also documents about your GDPR policy?"
    )

    queries.execute()
    # >>> Searching for `Video` with query `investment case study` using `SearchType.VIDEO`
    # >>> Searching for `Documents` with query `GDPR policy` using `SearchType.EMAIL`

This example illustrates how to combine OpenAI function calls and Pydantic to handle nested, structured data. We define the Search and MultiSearch schemas as Pydantic models, so every field carries a type and a description, and all of our data must adhere to them.

With this setup, the OpenAI API can interpret our queries correctly, and Pydantic validates the output from the API. We've thus created a lightweight yet robust system for handling structured data without relying on heavy abstractions.
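To show roughly what sits between the model class and the API, here is a minimal sketch built on plain Pydantic. The helpers `to_function_schema` and `parse_arguments` are hypothetical names invented for illustration; they are not the implementation of `openai_schema` or `from_response`, only an assumption about the kind of work such helpers do. The JSON strings stand in for arguments the model might return.

```python
import json
from typing import List

from pydantic import BaseModel, Field, ValidationError


class Search(BaseModel):
    title: str = Field(..., description="Title of the request")
    query: str = Field(..., description="Query to search for relevant content")


class MultiSearch(BaseModel):
    """Segment a request into multiple search queries"""

    searches: List[Search] = Field(..., description="List of searches")


def to_function_schema(model):
    # Hypothetical helper: wrap the model's JSON schema in a function description.
    # Pydantic v2 exposes model_json_schema(); v1 exposes schema().
    if hasattr(model, "model_json_schema"):
        schema = model.model_json_schema()
    else:
        schema = model.schema()
    return {
        "name": model.__name__,
        "description": (model.__doc__ or "").strip(),
        "parameters": schema,
    }


def parse_arguments(model, arguments: str):
    # Hypothetical helper: decode the JSON string, then let Pydantic validate it.
    return model(**json.loads(arguments))


function = to_function_schema(MultiSearch)

# Well-formed arguments become a typed MultiSearch instance.
good = '{"searches": [{"title": "Video", "query": "investment case study"}]}'
result = parse_arguments(MultiSearch, good)

# Arguments missing a required field are rejected.
bad = '{"searches": [{"title": "Video"}]}'
try:
    parse_arguments(MultiSearch, bad)
    rejected = False
except ValidationError:
    rejected = True

print(function["name"], result.searches[0].query, rejected)
```

Because a failed validation raises an exception, wrapping the request in a `tenacity` retry decorator, as `segment` does, means a malformed reply simply triggers another attempt.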

The Light Way Forward

The synergy between OpenAI function calls and Pydantic presents a minimalist, Pythonic way to handle structured data. It demonstrates how we can accomplish a lot without relying on heavy-handed frameworks.

With these tools you stay in control and can understand every interaction with the underlying API. This minimalist approach reduces complexity and leaves you with less code to learn and debug.

As we progress on this exciting journey, consider starring the repo here and following me on Twitter at @jxnlco for more insights and updates.

Embrace this light, efficient approach and elevate your data handling experience with OpenAI and Pydantic!