Guide for using the Aitta client library
Introduction
This guide explains how to install and use the Aitta client library, available on PyPI, to interact with the Aitta API for your ML inference tasks. Whether you need text completion or chat-style interactions, this guide provides a step-by-step approach to help you get started.
Note that both the API and the client library are still under heavy development; while we try to keep changes mostly backwards-compatible, breaking changes may happen. Access to Aitta is currently restricted to selected beta users.
Step 1: Installing the Aitta Client Library
To begin, install the Aitta client library into your Python environment. The library is available on PyPI and can be installed with pip:
pip install aitta-client
Step 2: Setting Up API Access
To interact with the Aitta API, you need an access token:
- Go to the Aitta web interface.
- Navigate to the model's page, select the "API Key" tab, and press the "Generate API key" button.
- Use this key to configure the client library.
With the key, you can create an instance of StaticAccessTokenSource to use with the client library.
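Conceptually, a static token source simply hands the same pre-generated API key back to the client on every request, without ever refreshing it. The stand-in class below is an illustrative sketch of that pattern only; it is not the library's actual implementation, although the real StaticAccessTokenSource exposes a get_access_token() method as used later in this guide:

```python
class IllustrativeStaticTokenSource:
    """Illustrative stand-in for the idea behind StaticAccessTokenSource."""

    def __init__(self, token: str):
        self._token = token

    def get_access_token(self) -> str:
        # a static source never refreshes; it always returns the same key
        return self._token


source = IllustrativeStaticTokenSource("my-api-key")
print(source.get_access_token())  # → my-api-key
```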
Step 3: Using Aitta for Inference
Two scenarios are demonstrated in code examples below:
- Using the model to complete a single input
- Initiating a chat conversation with the model
In both examples the Aitta client is first configured with the access token (step 2) and the API URL. These are used to create a StaticAccessTokenSource, which provides the authentication mechanism for interacting with the model through the Aitta API.
Main client API classes
- Client: implements all requests to the Aitta API servers at a low level and is used by all other classes
- AccessTokenSource: used by the client to obtain (and eventually refresh) access tokens
- Model: represents a model and provides methods to perform inference
- Task: represents an active inference task and provides methods to query its current status and results
Text completion with the LumiOpen/Poro model
This example demonstrates how to provide the model with a single input to generate a completion. The model takes the input text and predicts the most likely continuation based on the given prompt.
from aitta_client import Model, Client, StaticAccessTokenSource
# configure Client instance with access token and API URL
poro_access_token = "<generate your model-specific token from https://staging-aitta.2.rahtiapp.fi/ and enter it here>"
token_source = StaticAccessTokenSource(poro_access_token)
client = Client("https://api-staging-aitta.2.rahtiapp.fi", token_source)
# load the LumiOpen/Poro model
model = Model.load("LumiOpen/Poro", client)
print(model.description)
# declare inputs and parameters for a text completion inference task
inputs = {
    'input': 'Suomen paras kaupunki on'
}
params = {
    'do_sample': True,
    'max_new_tokens': 20
}
print(f"INPUT:\n{inputs}")
# start the inference and wait for completion
result = model.start_and_await_inference(inputs, params)
print(f"OUTPUT:\n{result}")
OpenAI chat completion with the LumiOpen/Poro-34b-chat model
This example shows how to start a conversation with the model through Aitta's OpenAI-compatible chat completion endpoint, using the official OpenAI Python client.
from aitta_client import Model, Client, StaticAccessTokenSource
import openai
# configure Client instance with access token and API URL
poro_access_token = "<generate your model-specific token from https://staging-aitta.2.rahtiapp.fi/ and enter it here>"
token_source = StaticAccessTokenSource(poro_access_token)
client = Client("https://api-staging-aitta.2.rahtiapp.fi", token_source)
# load the LumiOpen/Poro-34B-chat model
model = Model.load("LumiOpen/Poro-34B-chat", client)
print(model.description)
# configure an OpenAI client to use the Aitta OpenAI compatibility endpoints
# (named openai_client to avoid shadowing the Aitta Client instance above)
openai_client = openai.OpenAI(api_key=token_source.get_access_token(), base_url=model.openai_api_url)
# perform a chat completion with the OpenAI client
chat_completion = openai_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Say this is a test"
        }
    ],
    model=model.id,
    stream=False  # response streaming is currently not supported by Aitta
)
print(chat_completion.choices[0].message.content)
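The messages parameter follows the standard OpenAI chat format, so a multi-turn conversation is built by appending the assistant's reply and the next user turn before calling the completion endpoint again. The sketch below shows only how the message list grows; the assistant text is a placeholder standing in for chat_completion.choices[0].message.content:

```python
# conversation history in the OpenAI chat message format
messages = [{"role": "user", "content": "Say this is a test"}]

# after receiving a completion, append the assistant reply ...
assistant_reply = "This is a test."  # placeholder for the model's actual reply
messages.append({"role": "assistant", "content": assistant_reply})

# ... then the next user turn, and pass the full list to the next create() call
messages.append({"role": "user", "content": "Now say it in Finnish"})

print([m["role"] for m in messages])  # → ['user', 'assistant', 'user']
```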