Structured Output

This example demonstrates how gollm can extract structured information from unstructured text, with optional validation and concurrent processing. This is particularly useful for tasks such as information extraction, data parsing, and automated data entry from natural language input.

Let's start with the structured_data_extractor.go file:

  1. ExtractStructuredData Function: This is the main function for extracting structured data:

    func ExtractStructuredData[T any](ctx context.Context, l LLM, text string, opts ...PromptOption) (*T, error) {

    It's a generic function that can extract data into any struct type T.

  2. JSON Schema Generation: The function generates a JSON schema based on the provided struct type:

    schema, err := generateJSONSchemaFromStruct(structType, l)

  3. Prompt Creation: It creates a prompt that includes the input text and the generated JSON schema:

    prompt := NewPrompt(
        fmt.Sprintf("Extract the following information from the given text:\n\n%s\n\nRespond with a JSON object matching this schema:\n%s", text, schema),
        // ... directives and options ...
    )

  4. LLM Generation and Parsing: The function generates a response using the LLM and parses it into the struct:

    response, err := l.Generate(ctx, prompt, WithJSONSchemaValidation())
    // ... error handling ...
    var result T
    if err := json.Unmarshal([]byte(response), &result); err != nil {
        // ... error handling ...
    }

  5. Validation: The extracted data is validated against the struct's validation rules:

    if err := Validate(result); err != nil {
        // ... error handling ...
    }
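The schema, parse, and validate steps above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not gollm's implementation: `schemaFor`, `parseAndValidate`, and `checkReview` are hypothetical helpers, the `MovieReview` fields are invented for the example, a canned string stands in for the LLM response, and a hand-written check replaces tag-based validation.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"reflect"
	"strings"
)

// MovieReview is a hypothetical target type; its fields are assumptions.
type MovieReview struct {
	Title  string  `json:"title"`
	Rating float64 `json:"rating"`
}

// jsonType maps a Go kind to a JSON Schema type name (simplified).
func jsonType(k reflect.Kind) string {
	switch k {
	case reflect.String:
		return "string"
	case reflect.Bool:
		return "boolean"
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
		reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64,
		reflect.Float32, reflect.Float64:
		return "number"
	case reflect.Slice, reflect.Array:
		return "array"
	default:
		return "object"
	}
}

// schemaFor describes the exported fields of struct type T as a minimal
// JSON Schema. It is a stand-in, not gollm's generateJSONSchemaFromStruct.
func schemaFor[T any]() (string, error) {
	t := reflect.TypeOf((*T)(nil)).Elem()
	if t.Kind() != reflect.Struct {
		return "", fmt.Errorf("expected a struct, got %s", t.Kind())
	}
	props := map[string]any{}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue
		}
		// Use the json tag name when present, otherwise the field name.
		name := strings.Split(f.Tag.Get("json"), ",")[0]
		if name == "" {
			name = f.Name
		}
		props[name] = map[string]string{"type": jsonType(f.Type.Kind())}
	}
	b, err := json.Marshal(map[string]any{"type": "object", "properties": props})
	return string(b), err
}

// parseAndValidate decodes a raw response into T, then applies the
// caller-supplied check (a stand-in for struct-tag validation).
func parseAndValidate[T any](response string, check func(T) error) (*T, error) {
	var result T
	if err := json.Unmarshal([]byte(response), &result); err != nil {
		return nil, fmt.Errorf("failed to parse response: %w", err)
	}
	if check != nil {
		if err := check(result); err != nil {
			return nil, fmt.Errorf("validation failed: %w", err)
		}
	}
	return &result, nil
}

// checkReview is a hand-written rule set for the hypothetical MovieReview.
func checkReview(r MovieReview) error {
	if r.Title == "" {
		return errors.New("title is required")
	}
	if r.Rating < 0 || r.Rating > 10 {
		return errors.New("rating must be between 0 and 10")
	}
	return nil
}

func main() {
	schema, err := schemaFor[MovieReview]()
	if err != nil {
		fmt.Println("schema error:", err)
		return
	}
	fmt.Println("schema:", schema)

	// A canned response takes the place of the LLM call.
	review, err := parseAndValidate[MovieReview](`{"title":"Example Film","rating":8.5}`, checkReview)
	if err != nil {
		fmt.Println("extraction error:", err)
		return
	}
	fmt.Printf("%+v\n", *review)
}
```

Keeping parsing and validation as separate steps means a malformed response and a well-formed but invalid one produce distinguishable errors, which helps when deciding whether to retry the request.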

Now, let's look at the example usage in the main function:

  1. Struct Definitions: Two struct types are defined: MovieReview (without validation tags) and MovieReviewValidated (with validation tags).

  2. LLM Client Initialization: An LLM client is created using the Groq provider:

    llm, err := gollm.NewLLM(
        gollm.SetProvider("groq"),
        gollm.SetModel("llama-3.1-70b-versatile"),
        // ... other settings ...
    )

  3. Text Input: A sample movie review text is defined.

  4. Concurrent Extraction: The example demonstrates concurrent extraction of structured data with and without validation:

    go func() {
        review, err := extractReview[MovieReview](ctx, llm, text, false)
        // ... handle result ...
    }()
    
    go func() {
        reviewValidated, err := extractReview[MovieReviewValidated](ctx, llm, text, true)
        // ... handle result ...
    }()

  5. Result Handling: The results are collected and printed using channels and a select statement.

  6. Pretty Printing: The printReview function is used to display the extracted data in a readable format.
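The goroutine, channel, and select pattern described in steps 4 and 5 might look roughly like the sketch below. The example's actual channel layout is not shown here, so this is only one possible arrangement: `outcome`, `extractFunc`, `runConcurrently`, and `runDemo` are hypothetical names, and two fake extractors stand in for the `extractReview` calls so the block runs without an LLM.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// outcome carries one goroutine's result back to the collector.
type outcome struct {
	label string
	value string
	err   error
}

// extractFunc is a hypothetical stand-in for a call such as extractReview[T].
type extractFunc func(ctx context.Context, text string) (string, error)

// runConcurrently starts one goroutine per job and gathers the outcomes
// with select, returning early if the context is cancelled.
func runConcurrently(ctx context.Context, text string, jobs map[string]extractFunc) map[string]outcome {
	ch := make(chan outcome, len(jobs)) // buffered so senders never block
	for label, fn := range jobs {
		go func(label string, fn extractFunc) {
			v, err := fn(ctx, text)
			ch <- outcome{label: label, value: v, err: err}
		}(label, fn)
	}

	out := make(map[string]outcome, len(jobs))
	for range jobs {
		select {
		case o := <-ch:
			out[o.label] = o
		case <-ctx.Done():
			return out // give up on anything still outstanding
		}
	}
	return out
}

// runDemo wires two fake extractors together: one succeeds, one fails
// the way a validated extraction might.
func runDemo() map[string]outcome {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	jobs := map[string]extractFunc{
		"plain": func(ctx context.Context, text string) (string, error) {
			return "parsed: " + text, nil
		},
		"validated": func(ctx context.Context, text string) (string, error) {
			return "", errors.New("rating out of range")
		},
	}
	return runConcurrently(ctx, "sample review text", jobs)
}

func main() {
	results := runDemo()
	// Print in a fixed order, since map iteration order is not stable.
	for _, label := range []string{"plain", "validated"} {
		o := results[label]
		if o.err != nil {
			fmt.Printf("%s: error: %v\n", label, o.err)
			continue
		}
		fmt.Printf("%s: %s\n", label, o.value)
	}
}
```

Buffering the channel to the number of jobs lets every goroutine finish its send even when the collector returns early on cancellation, which avoids leaking blocked goroutines.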

This example showcases several advanced features of gollm:

  • Generic structured data extraction based on Go struct definitions

  • Automatic JSON schema generation from struct types

  • Concurrent processing of multiple extraction tasks

  • Handling of validation rules in struct tags

  • Use of channels for asynchronous result collection

  • Pretty printing of structured data
