Streaming ChatGPT results to a view using Swift's AsyncSequence

An example of how to stream text from ChatGPT to an iOS text view using URLSession's bytes(for:delegate:) method.

Michael Edgcumbe · June 17th, 2023 – 1 minute read

After comparing the results from the davinci and curie models from ChatGPT, I decided to stick with davinci and come up with a way to stream the results to my text view. After some brief searching, I could only find implementations that stream results using JavaScript; nothing came up for iOS.

So I made an example using the AsyncSequence returned by URLSession's bytes(for:delegate:) method.

The method I'm using to get results from ChatGPT checks whether it was given a delegate to stream results to, and if so, it sets the "stream" field of the JSON request body to true.

public func query(languageGeneratorRequest: LanguageGeneratorRequest, delegate: AssistiveChatHostStreamResponseDelegate? = nil) async throws -> NSDictionary? {
    if searchSession == nil {
        searchSession = try await session()
    }

    guard let searchSession = searchSession else {
        throw LanguageGeneratorSessionError.ServiceNotFound
    }

    let components = URLComponents(string: LanguageGeneratorSession.serverUrl)

    guard let url = components?.url else {
        throw LanguageGeneratorSessionError.ServiceNotFound
    }

    var request = URLRequest(url:url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.setValue("Bearer \(self.openaiApiKey)", forHTTPHeaderField: "Authorization")

    var body:[String : Any] = ["model":languageGeneratorRequest.model, "prompt":languageGeneratorRequest.prompt, "max_tokens":languageGeneratorRequest.maxTokens,"temperature":languageGeneratorRequest.temperature]

    if let stop = languageGeneratorRequest.stop {
        body["stop"] = stop
    }

    if let user = languageGeneratorRequest.user {
        body["user"] = user
    }

    // Ask for streamed partial results only when a delegate is there to receive them.
    if delegate != nil {
        body["stream"] = true
    }

    let jsonData = try JSONSerialization.data(withJSONObject: body)
    request.httpBody = jsonData

    if let delegate = delegate {
        return try await fetchBytes(urlRequest: request, apiKey: self.openaiApiKey, session: searchSession, delegate: delegate)
    } else {
        return try await fetch(urlRequest: request, apiKey: self.openaiApiKey) as? NSDictionary
    }
}
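The body-building step can be looked at in isolation. The following is a minimal sketch, assuming a hypothetical helper named makeCompletionBody that is not part of the original code; it only illustrates that "stream" is added to the JSON when streaming is requested and left out otherwise.

```swift
import Foundation

// Hypothetical helper, not part of the original code: builds the JSON body
// for a completion request and adds "stream" only when streaming is wanted.
func makeCompletionBody(model: String,
                        prompt: String,
                        maxTokens: Int,
                        temperature: Double,
                        stream: Bool) throws -> Data {
    var body: [String: Any] = [
        "model": model,
        "prompt": prompt,
        "max_tokens": maxTokens,
        "temperature": temperature
    ]
    if stream {
        body["stream"] = true
    }
    return try JSONSerialization.data(withJSONObject: body)
}

// Usage: decode the body again to inspect which keys were sent.
// The prompt and parameter values here are made up for illustration.
let bodyData = try makeCompletionBody(model: "text-davinci-003",
                                      prompt: "Where can I get lunch?",
                                      maxTokens: 64,
                                      temperature: 0,
                                      stream: true)
let decodedBody = try JSONSerialization.jsonObject(with: bodyData) as? [String: Any]
print(decodedBody?.keys.sorted() ?? [])
```

Keeping the flag tied to the presence of a receiver means a caller that does not stream never has to handle the chunked response format.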

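When a delegate is supplied, the request goes through URLSession's bytes(for:delegate:) method, which returns the response body as an AsyncSequence of bytes instead of waiting for the whole payload to arrive.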

internal func fetchBytes(urlRequest:URLRequest, apiKey:String, session:URLSession, delegate:AssistiveChatHostStreamResponseDelegate) async throws -> NSDictionary? {
    print("Requesting URL: \(String(describing: urlRequest.url))")
    let (bytes, response) = try await session.bytes(for: urlRequest)

    guard let httpResponse = response as? HTTPURLResponse, httpResponse.statusCode == 200 else {
        throw LanguageGeneratorSessionError.InvalidServerResponse
    }

    let retVal = NSMutableDictionary()
    var fullString = ""

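    for try await line in bytes.lines {
        // Each event payload is prefixed with "data:"; skip any other line.
        guard line.hasPrefix("data:") else {
            continue
        }

        // Strip the prefix and the space after it before parsing.
        let payload = line.dropFirst(5).trimmingCharacters(in: .whitespaces)
        if payload == "[DONE]" {
            break
        }

        guard let d = payload.data(using: .utf8) else {
            throw LanguageGeneratorSessionError.InvalidServerResponse
        }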
        guard let json = try JSONSerialization.jsonObject(with: d) as? NSDictionary else {
            throw LanguageGeneratorSessionError.InvalidServerResponse
        }
        guard let choices = json["choices"] as? [NSDictionary] else {
            throw LanguageGeneratorSessionError.InvalidServerResponse
        }

        if let firstChoice = choices.first, let text = firstChoice["text"] as? String {
            fullString.append(text)
            delegate.didReceiveStreamingResult(with: text)
        }
    }

    fullString = fullString.trimmingCharacters(in: .whitespaces)
    retVal["text"] = fullString
    return retVal
}

Most of the code above parses each partial response and extracts the text from it. Each line of the stream is a server-sent event: the prefix "data: " followed by a JSON object. A single line looks like this:

data: {"id":"cmpl-7SpPLxKO1IrAdgQ2ztKdynpa4MSp1","object":"text_completion","created":1687105151,"choices":[{"text":" sushi","index":0,"logprobs":null,"finish_reason":null}],"model":"text-davinci-003"}
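The same extraction can also be written with Codable instead of NSDictionary casts. This is a sketch under assumptions: the CompletionChunk type and the textFragment(fromEventLine:) helper are hypothetical names, and the struct models only the fields visible in the sample line above.

```swift
import Foundation

// Assumed shape of one streamed chunk, based only on the sample line above.
// Keys that are not declared here are ignored by the decoder.
struct CompletionChunk: Decodable {
    struct Choice: Decodable {
        let text: String
    }
    let choices: [Choice]
}

// Hypothetical helper: returns the text fragment carried by one event line,
// or nil for lines without a fragment (non-data lines, the [DONE] marker).
func textFragment(fromEventLine line: String) throws -> String? {
    guard line.hasPrefix("data:") else { return nil }
    let payload = line.dropFirst(5).trimmingCharacters(in: .whitespaces)
    if payload == "[DONE]" { return nil }
    let chunk = try JSONDecoder().decode(CompletionChunk.self, from: Data(payload.utf8))
    return chunk.choices.first?.text
}

// Usage with the sample line from the article.
let sampleLine = #"data: {"id":"cmpl-7SpPLxKO1IrAdgQ2ztKdynpa4MSp1","object":"text_completion","created":1687105151,"choices":[{"text":" sushi","index":0,"logprobs":null,"finish_reason":null}],"model":"text-davinci-003"}"#
let fragment = try textFragment(fromEventLine: sampleLine)
print(fragment ?? "")  // prints " sushi" (with the leading space)
```

A typed model makes a malformed chunk fail in one place, the decode call, rather than in a chain of optional casts.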

fetchBytes also gathers the full response into a string and returns it in a dictionary of its own, although that return value goes unused in my code right now. I intend to cache these responses in iCloud at some point, so it makes sense to do both in the long run.
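The receiving side of the delegate is not shown in this post, so here is a rough sketch of what it might look like. The protocol declaration is an assumption inferred from the call delegate.didReceiveStreamingResult(with:), and StreamingTextReceiver and its onUpdate closure are hypothetical names used only for illustration.

```swift
import Foundation

// Assumed shape of the delegate protocol. The post only shows the call site,
// so this declaration (including the AnyObject constraint) is a guess.
protocol AssistiveChatHostStreamResponseDelegate: AnyObject {
    func didReceiveStreamingResult(with string: String)
}

// Hypothetical receiver: accumulates fragments and reports the running text.
final class StreamingTextReceiver: AssistiveChatHostStreamResponseDelegate {
    private(set) var text = ""

    // In an app this closure would update the view, for example by assigning
    // the running text to a text view on the main thread.
    var onUpdate: ((String) -> Void)?

    func didReceiveStreamingResult(with string: String) {
        text.append(string)
        onUpdate?(text)
    }
}

// Usage: feed a few made-up fragments and print the text as it grows.
let receiver = StreamingTextReceiver()
receiver.onUpdate = { runningText in
    print(runningText)
}
for piece in [" sushi", " near", " me"] {
    receiver.didReceiveStreamingResult(with: piece)
}
```

Passing the running text rather than the latest fragment keeps the view update a simple assignment, at the cost of re-sending text the view already has.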
