
Tosh Velaga

Founder of Typeblock

May 30, 2023

How to stream the OpenAI API response

Streaming the response is the ideal UX for chatbot-style AI applications. This allows our users to see results instantly instead of waiting for the entire completion to finish. ChatGPT has definitely popularized this UI pattern, and it is what users have come to expect. The problem, however, is that it can be a bit tricky to set up with the OpenAI chat completion endpoint. In this tutorial we will walk through a minimal example of setting up OpenAI streaming using React JS and Node JS.

Let’s start with the server-side code, since we can easily test it with Postman. For this example I have a Node JS and Express server; however, you could just as easily use the API routes that Next JS provides. The only downside of the Next JS server is the 10-second timeout on the free tier, which is why I chose to set up my own server.
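Before the route handler, here is a minimal sketch of how such an Express app might be wired up. The port (5001) and the route path match the client code shown later in this post; the file names and the cors dependency are assumptions for illustration.

// server.js - minimal sketch of an Express app that mounts the completion route
import express from 'express'
import cors from 'cors'
import completionRouter from './routes/completion.js' // the route handler shown below

const app = express()
app.use(cors()) // allow the React dev server to call this API from another origin

// the client later in this post posts to http://localhost:5001/api/completion/completion
app.use('/api/completion/completion', completionRouter)

app.listen(5001, () => console.log('API listening on http://localhost:5001'))

The route handler itself looks like this: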

import express from 'express'
import { Configuration, OpenAIApi } from 'openai'

const router = express.Router()

const configuration = new Configuration({ apiKey: process.env.OPENAI_API_KEY })
const openai = new OpenAIApi(configuration)

router.post('/', (req, res) => {
  const response = openai.createChatCompletion(
    {
      model: 'gpt-3.5-turbo',
      stream: true,
      messages: [
        { role: 'system', content: 'You are an SEO expert.' },
        { role: 'user', content: 'Write a paragraph about no-code tools to build in 2021.' },
      ],
    },
    { responseType: 'stream' }
  )
  console.log(response)

  response.then((resp) => {
    resp.data.on('data', (chunk) => {
      // console.log the buffer value
      console.log('chunk: ', chunk)
      // convert the buffer to a string and split it into individual payloads
      const payloads = chunk.toString().split('\n\n')
      console.log('payloads: ', payloads)
      for (const payload of payloads) {
        // the stream ends with a '[DONE]' payload
        if (payload.includes('[DONE]')) {
          res.end() // close the connection and return
          return
        }
        if (payload.startsWith('data:')) {
          // remove the 'data: ' prefix and parse the corresponding JSON object
          try {
            const data = JSON.parse(payload.replace('data: ', ''))
            const text = data.choices[0].delta?.content
            if (text) {
              console.log('text: ', text)
              // send the text fragment to the client
              res.write(`${text}`)
            }
          } catch (error) {
            console.log(`Error with JSON.parse and ${payload}.\n${error}`)
          }
        }
      }
    })
  })
})

export default router
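Since the route streams back plain text, you can test it before writing any client code, either in Postman or with something like curl -N -X POST http://localhost:5001/api/completion/completion (the -N flag turns off curl’s output buffering so the text appears as it arrives). The port and path here assume the server setup sketched above.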

On the server side we are using the Node JS library for OpenAI and the gpt-3.5-turbo model. We set responseType to 'stream' and then process the buffers emitted by the readable stream. Each buffer is converted to a string and split into individual payloads; we then parse each payload to extract the text and send it to the client using res.write. See the comments in the code for a detailed explanation.
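If it helps to see that parsing in isolation, the chunk handling can be pulled out into a small standalone helper. This is only a sketch of the same splitting and parsing done inside the route above; the extractDeltas name is made up for illustration.

// extractDeltas.js - sketch: turn one raw stream chunk into an array of text fragments
export function extractDeltas(chunk) {
  const texts = []
  // a chunk can contain several 'data: {...}' payloads separated by blank lines
  for (const payload of chunk.toString().split('\n\n')) {
    if (payload.includes('[DONE]')) break // end-of-stream marker
    if (!payload.startsWith('data:')) continue // skip anything that is not a data payload
    try {
      const data = JSON.parse(payload.replace('data: ', ''))
      const text = data.choices[0].delta?.content
      if (text) texts.push(text)
    } catch {
      // a payload that arrives split across chunks will fail to parse; skip it
    }
  }
  return texts
}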

We can also console.log each part of the readable stream to get a better sense of the data that is being read and eventually sent back to the client.
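For reference, each decoded payload looks roughly like the following (IDs and values are illustrative; the shape comes from OpenAI’s chat.completion.chunk objects, and the stream ends with a [DONE] marker):

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1685000001,"model":"gpt-3.5-turbo","choices":[{"delta":{"content":"No"},"index":0,"finish_reason":null}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1685000001,"model":"gpt-3.5-turbo","choices":[{"delta":{"content":"-code"},"index":0,"finish_reason":null}]}

data: [DONE]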

[Screenshot: server console output showing the logged chunks, payloads, and extracted text]

On the client side we have a simple React JS component with a button. When the user clicks the button, the component makes a POST request to the API on our server, reads the streamed response, and appends each decoded chunk to the component state.

[Screenshot: the client-side React component]

Here is the full client-side code. Notice that we don’t use any dependencies: we rely on the native fetch and the browser’s TextDecoderStream API to read the stream. For more information on reading streams in the browser, check out this article from Mozilla.

'use client'

import React, { useState } from 'react'

export default function Home() {
  const [value, setValue] = useState('')

  const handleClick = async () => {
    const response = await fetch(
      'http://localhost:5001/api/completion/completion',
      {
        method: 'POST',
        headers: {
          'Content-Type': 'text/event-stream',
        },
      }
    )

    // decode the byte stream into text as it arrives
    const reader = response.body
      .pipeThrough(new TextDecoderStream())
      .getReader()

    while (true) {
      const { value: chunk, done } = await reader.read()
      if (done) break
      console.log('Received: ', chunk)
      // append each decoded chunk to the state so the UI updates as text streams in
      setValue((prev) => prev + chunk)
    }
  }

  return (
    <main>
      <p>Streaming response:</p>
      <br />
      <div style={{ whiteSpace: 'pre-wrap' }}>{value}</div>
      <button onClick={handleClick}>Submit</button>
    </main>
  )
}
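If TextDecoderStream is not available in a browser you need to support, the same loop inside handleClick can be written with a plain TextDecoder instead; this is a sketch of the equivalent approach:

// sketch: the read loop inside handleClick, without TextDecoderStream
const reader = response.body.getReader()
const decoder = new TextDecoder()

while (true) {
  const { value, done } = await reader.read()
  if (done) break
  // { stream: true } keeps multi-byte characters intact when they are split across chunks
  setValue((prev) => prev + decoder.decode(value, { stream: true }))
}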

I hope this guide was helpful for setting up streaming with OpenAI’s chat completion endpoint. If you are interested in building your own AI app without having to set up a server, database, or frontend, check out our no-code solution, Typeblock.