
I wrote a Swift app in under an hour

Can I create an app in under an hour that converts long-form spoken audio to bullet points for texting? Let's find out!

Preface

I wanted an app that could take long-form spoken audio and convert it to bullet points for texting.

Unfortunately, I am not an iOS dev. But fortunately, GPT-4 is!

Check out my journey below, or head straight to the repo here:

GitHub - bramses/i-aint-reading-all-that: voice record > whisper > gpt > copy to clipboard

It pipes recorded audio to the Whisper API and then to ChatGPT for summarization. The user can then tap a button to copy the result to the clipboard. Simple!
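
Before diving into the chat log, here's a minimal sketch of that pipeline in async/await style. This is my own summary, not code GPT produced; transcribe and summarize are placeholder stand-ins for the Alamofire calls that show up later.

import UIKit

// Placeholder stand-ins for the real API calls built later in the chat log.
func transcribe(_ fileURL: URL) async throws -> String { "transcript goes here" }
func summarize(_ text: String) async throws -> String { "• bullet point goes here" }

// The whole app is this pipeline: record -> Whisper -> ChatGPT -> clipboard.
func handleRecording(fileURL: URL) async {
    do {
        let transcript = try await transcribe(fileURL)   // Whisper transcription
        let bullets = try await summarize(transcript)    // ChatGPT summarization
        await MainActor.run {
            UIPasteboard.general.string = bullets         // one-tap copy for texting
        }
    } catch {
        print("Pipeline failed: \(error)")
    }
}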

Chat Log

---
system_commands: ['I am a generalist that is not afraid to make radical associations between distinct topics by using creative leaps.']
temperature: 0.0
top_p: 1
max_tokens: 1000
presence_penalty: 0
frequency_penalty: 0
stream: true
stop: null
n: 1
model: gpt-4
---

write a ios app that calls the whisper api (1) and then calls the chat api (2) to turn this prompt "summarize this text into bullet points" and uses swiftui to make it look nice

(1)

curl --request POST \
  --url https://api.openai.com/v1/audio/transcriptions \
  --header 'Authorization: Bearer TOKEN' \
  --header 'Content-Type: multipart/form-data' \
  --form file=@/path/to/file/openai.mp3 \
  --form model=whisper-1

(2)

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

GPT:

To create an iOS app that calls the Whisper API and Chat API, and uses SwiftUI to display the results, follow these steps:

  1. Create a new SwiftUI project in Xcode.
  2. Install Alamofire for making API requests: Add Alamofire to your Package.swift file or use Swift Package Manager to add it to your project.
  3. Create a new Swift file called APIManager.swift to handle API requests.
  4. Add the following code to APIManager.swift:
import Foundation
import Alamofire

class APIManager {
    static let shared = APIManager()
    private let openAIURL = "https://api.openai.com/v1"
    private let apiKey = "YOUR_API_KEY"

    private init() {}

    func transcribeAudio(fileURL: URL, completion: @escaping (Result<String, Error>) -> Void) {
        let headers: HTTPHeaders = [
            "Authorization": "Bearer \(apiKey)"
        ]

        let url = "\(openAIURL)/audio/transcriptions"

        AF.upload(multipartFormData: { multipartFormData in
            multipartFormData.append(fileURL, withName: "file")
            multipartFormData.append("whisper-1".data(using: .utf8)!, withName: "model")
        }, to: url, headers: headers).responseJSON { response in
            switch response.result {
            case .success(let value):
                if let json = value as? [String: Any], let transcription = json["transcription"] as? String {
                    completion(.success(transcription))
                } else {
                    completion(.failure(NSError(domain: "", code: -1, userInfo: nil)))
                }
            case .failure(let error):
                completion(.failure(error))
            }
        }
    }

    func chatAPI(prompt: String, completion: @escaping (Result<String, Error>) -> Void) {
        let headers: HTTPHeaders = [
            "Authorization": "Bearer \(apiKey)",
            "Content-Type": "application/json"
        ]

        let url = "\(openAIURL)/chat/completions"
        let parameters: [String: Any] = [
            "model": "gpt-3.5-turbo",
            "messages": <a href="https://bram-adams.ghost.io/"role": "user", "content": prompt">"role": "user", "content": prompt</a>
        ]

        AF.request(url, method: .post, parameters: parameters, encoding: JSONEncoding.default, headers: headers).responseJSON { response in
            switch response.result {
            case .success(let value):


Bram:

keep going (also add a record button in the ui for users to record audio in the app to be sent off, and add a env where users can safely upload their api key)


GPT:

Continuing from the previous code, complete the chatAPI function and add a new function to convert the text into bullet points:

                if let json = value as? [String: Any], let choices = json["choices"] as? [[String: Any]], let firstChoice = choices.first, let text = firstChoice["text"] as? String {
                    completion(.success(text))
                } else {
                    completion(.failure(NSError(domain: "", code: -1, userInfo: nil)))
                }
            case .failure(let error):
                completion(.failure(error))
            }
        }
    }

    func summarizeToBulletPoints(text: String, completion: @escaping (Result<String, Error>) -> Void) {
        chatAPI(prompt: "Summarize this text into bullet points: \(text)", completion: completion)
    }
}

Now, create a new SwiftUI view called ContentView.swift and add the following code:

import SwiftUI
import AVFoundation

struct ContentView: View {
    @State private var isRecording = false
    @State private var audioRecorder: AVAudioRecorder!
    @State private var bulletPoints: String = ""

    var body: some View {
        VStack {
            Text("Bullet Points:")
                .font(.title)
            ScrollView {
                Text(bulletPoints)
            }
            .padding()

            Button(action: {
                if isRecording {
                    stopRecording()
                } else {
                    startRecording()
                }
                isRecording.toggle()
            }) {
                Text(isRecording ? "Stop Recording" : "Start Recording")
                    .padding()
                    .background(isRecording ? Color.red : Color.blue)
                    .foregroundColor(.white)
                    .clipShape(Capsule())
            }
            .padding()
        }
    }

    func startRecording() {
        let audioFilename = getDocumentsDirectory().appendingPathComponent("recording.m4a")

        let settings = [
            AVFormatIDKey: Int(kAudioFormatMPEG4AAC),
            AVSampleRateKey: 12000,
            AVNumberOfChannelsKey: 1,
            AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue
        ]

        do {
            audioRecorder = try AVAudioRecorder(url: audioFilename, settings: settings)
            audioRecorder.record()
        } catch {
            print("Could not start recording")
        }
    }

    func stopRecording() {
        audioRecorder.stop()

        APIManager.shared.transcribeAudio(fileURL: audioRecorder.url) { result in
            switch result {
            case .success(let transcription):
                APIManager.shared.summarizeToBulletPoints(text: transcription) { result in
                    switch result {
                    case .success(let summary):
                        DispatchQueue.main.async {
                            bulletPoints = summary
                        }
                    case .failure(let error):
                        print("Error summarizing text: \(error)")
                    }
                }
            case .failure(let error):
                print("Error transcribing audio: \(error)")
            }
        }
    }

    func getDocumentsDirectory() -> URL {
        let paths = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)
        return paths[0]
    }
}

struct ContentView_Previews: PreviewProvider {
    static var previews: some View {
        ContentView()
    }
}

This code adds a record button to the UI and allows users to record audio in the app. The recorded audio is then sent to the Whisper API for transcription and the Chat API for summarization.
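
One gap worth flagging in the generated startRecording(): on a physical device, AVAudioRecorder usually needs the shared AVAudioSession configured and activated for recording before record() will capture anything. A minimal sketch of that setup, assuming the standard AVAudioSession API (my addition, not part of the chat output):

import AVFoundation

// Configure the shared audio session for recording; call this at the top of
// startRecording(), before creating the AVAudioRecorder.
func prepareAudioSession() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .default, options: [.defaultToSpeaker])
    try session.setActive(true)
}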

To securely store the API key, use the Keychain Services API. You can use the KeychainSwift library to simplify the process. Add KeychainSwift to your Package.swift file or use Swift Package Manager to add it to your project.

Then, in APIManager.swift, replace the apiKey constant with the following code:

private let keychain = KeychainSwift()
private var apiKey: String {
    get {
        return keychain.get("openai_api_key") ?? ""
    }
    set {
        keychain.set(newValue, forKey: "openai_api_key")
    }
}

Now, users can securely store their API key in the app's keychain. To set the API key, simply call APIManager.shared.apiKey = "YOUR_API_KEY".

Remember to request microphone access in your Info.plist by adding the NSMicrophoneUsageDescription key with a description of why your app needs microphone access.
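
Alongside that Info.plist entry, it's reasonable to prompt for microphone permission explicitly before the first recording. A small sketch, assuming the standard AVAudioSession permission API (again my addition rather than GPT's):

import AVFoundation

// Prompt for microphone access; the NSMicrophoneUsageDescription string from
// Info.plist is what appears in the system dialog.
func requestMicrophoneAccess(_ completion: @escaping (Bool) -> Void) {
    AVAudioSession.sharedInstance().requestRecordPermission { granted in
        DispatchQueue.main.async {
            completion(granted)
        }
    }
}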


Bram:

can you edit the ui to copy the resulting text with a one button click to clipboard


GPT:

Sure! You can add a "Copy to Clipboard" button to the UI that copies the resulting text with a single click. Update the ContentView.swift file with the following changes:

import SwiftUI
import AVFoundation

struct ContentView: View {
    @State private var isRecording = false
    @State private var audioRecorder: AVAudioRecorder!
    @State private var bulletPoints: String = ""

    var body: some View {
        VStack {
            Text("Bullet Points:")
                .font(.title)
            ScrollView {
                Text(bulletPoints)
            }
            .padding()

            Button(action: {
                if isRecording {
                    stopRecording()
                } else {
                    startRecording()
                }
                isRecording.toggle()
            }) {
                Text(isRecording ? "Stop Recording" : "Start Recording")
                    .padding()
                    .background(isRecording ? Color.red : Color.blue)
                    .foregroundColor(.white)
                    .clipShape(Capsule())
            }
            .padding()

            Button(action: {
                UIPasteboard.general.string = bulletPoints
            }) {
                Text("Copy to Clipboard")
                    .padding()
                    .background(Color.green)
                    .foregroundColor(.white)
                    .clipShape(Capsule())
            }
            .padding()
        }
    }

    // ... (rest of the code remains the same)
}

This code adds a "Copy to Clipboard" button below the recording button. When clicked, it copies the bullet points text to the clipboard. Users can then paste the text into other apps or documents.


Bram:

how do i create a package file for alamofire


GPT:

To add Alamofire as a dependency to your project using Swift Package Manager, you don't need to create a package file specifically for Alamofire. Instead, you can add Alamofire to your existing Package.swift file or directly through Xcode.

Here's how to add Alamofire using Xcode:

  1. Open your project in Xcode.
  2. In the menu bar, click on File > Add Packages….
  3. In the search bar, type Alamofire and press Enter.
  4. Select the Alamofire package from the search results (it should be the first result, by Alamofire).
  5. Click on Add Package.

Xcode will automatically add Alamofire as a dependency to your project and update your Package.swift file.

If you prefer to manually edit your Package.swift file, follow these steps:

  1. Open your project's Package.swift file.
  2. Add Alamofire to the dependencies array:
dependencies: [
    .package(url: "https://github.com/Alamofire/Alamofire.git", .upToNextMajor(from: "5.5.0"))
],
  1. Add Alamofire to the targets section, under the dependencies array of your app target:
targets: [
    .target(
        name: "YourAppName",
        dependencies: [
            .product(name: "Alamofire", package: "Alamofire")
        ]),
    // ...
],

Replace YourAppName with the name of your app target.

After making these changes, Alamofire will be added as a dependency to your project, and you can use it by importing Alamofire in your Swift files.


Bram:

SchemeBuildError: Failed to build the scheme "i-aint-reading-all-that"

cannot find 'KeychainSwift' in scope


GPT:

It seems like you're trying to use the KeychainSwift library, but it hasn't been added to your project yet. To add KeychainSwift using Swift Package Manager, follow these steps:

  1. Open your project in Xcode.
  2. In the menu bar, click on File > Add Packages….
  3. In the search bar, type KeychainSwift and press Enter.
  4. Select the KeychainSwift package from the search results (it should be the first result, by evgenyneu).
  5. Click on Add Package.

Xcode will automatically add KeychainSwift as a dependency to your project and update your Package.swift file.

If you prefer to manually edit your Package.swift file, follow these steps:

  1. Open your project's Package.swift file.
  2. Add KeychainSwift to the dependencies array:
dependencies: [
    .package(url: "https://github.com/evgenyneu/keychain-swift.git", .upToNextMajor(from: "23.0.0"))
],
  1. Add KeychainSwift to the targets section, under the dependencies array of your app target:
targets: [
    .target(
        name: "YourAppName",
        dependencies: [
            .product(name: "KeychainSwift", package: "keychain-swift")
        ]),
    // ...
],

Replace YourAppName with the name of your app target.

After making these changes, KeychainSwift will be added as a dependency to your project, and you can use it by importing KeychainSwift in your Swift files.


Bram:

i need some kind of input where users can paste their api key, and have a eye button next to it that show/hides it


GPT:

To create a text input field for users to paste their API key and an eye button to show/hide the input, you can modify the ContentView.swift file as follows:

import SwiftUI
import AVFoundation
import KeychainSwift

struct ContentView: View {
    @State private var isRecording = false
    @State private var audioRecorder: AVAudioRecorder!
    @State private var bulletPoints: String = ""
    @State private var apiKey: String = ""
    @State private var isApiKeyVisible = false

    private let keychain = KeychainSwift()

    var body: some View {
        VStack {
            HStack {
                TextField("Paste your API key here", text: $apiKey, onCommit: {
                    keychain.set(apiKey, forKey: "openai_api_key")
                })
                .textFieldStyle(RoundedBorderTextFieldStyle())
                .disableAutocorrection(true)
                .autocapitalization(.none)
                .textContentType(.password)
                .secureFieldStyle(isApiKeyVisible)

                Button(action: {
                    isApiKeyVisible.toggle()
                }) {
                    Image(systemName: isApiKeyVisible ? "eye.slash.fill" : "eye.fill")
                        .foregroundColor(.blue)
                }
            }
            .padding()

            // ... (rest of the code remains the same)
        }
    }

    // ... (rest of the code remains the same)

    private func secureFieldStyle(_ visible: Bool) -> some ViewModifier {
        if visible {
            return AnyViewModifier(EmptyModifier())
        } else {
            return AnyViewModifier(SecureFieldModifier())
        }
    }
}

private struct SecureFieldModifier: ViewModifier {
    func body(content: Content) -> some View {
        content
            .textContentType(.password)
            .keyboardType(.asciiCapable)
    }
}

private struct AnyViewModifier: ViewModifier {
    let modifier: Any

    init<M: ViewModifier>(_ modifier: M) {
        self.modifier = modifier
    }

    func body(content: Content) -> some View {
        content.modifier(modifier)
    }
}

struct ContentView_Previews: PreviewProvider {
    static var previews: some View {
        ContentView()
    }
}

This code adds a text input field for users to paste their API key and an eye button to show/hide the input. The API key is saved to the keychain when the user presses Enter or taps outside the text field. The eye button toggles the visibility of the API key.

Make sure you have imported the KeychainSwift library at the beginning of the ContentView.swift file.


Bram:

SchemeBuildError: Failed to build the scheme "i-aint-reading-all-that"

value of type 'some View' has no member 'secureFieldStyle'


GPT:

Apologies for the confusion. It seems I made a mistake in the code. Please replace the secureFieldStyle(_:) function and the custom ViewModifier structs with the following updated code:

@ViewBuilder
private func secureFieldStyle(_ content: TextField<Text>, visible: Bool) -> some View {
    if visible {
        content
    } else {
        content
            .textContentType(.password)
            .keyboardType(.asciiCapable)
    }
}

Then, update the body property of the VStack to use the new secureFieldStyle(_:visible:) function:

VStack {
    HStack {
        let textField = TextField("Paste your API key here", text: $apiKey, onCommit: {
            keychain.set(apiKey, forKey: "openai_api_key")
        })
        .textFieldStyle(RoundedBorderTextFieldStyle())
        .disableAutocorrection(true)
        .autocapitalization(.none)

        secureFieldStyle(textField, visible: isApiKeyVisible)

        Button(action: {
            isApiKeyVisible.toggle()
        }) {
            Image(systemName: isApiKeyVisible ? "eye.slash.fill" : "eye.fill")
                .foregroundColor(.blue)
        }
    }
    .padding()

    // ... (rest of the code remains the same)
}

This updated code should work without any issues. The secureFieldStyle(_:visible:) function now takes a TextField<Text> as its first argument and a Bool for visibility as its second argument. The function uses the @ViewBuilder attribute to conditionally apply the .textContentType(.password) and .keyboardType(.asciiCapable) modifiers.


Bram:

getting

Type 'Any' cannot conform to 'ViewModifier'

for:

func body(content: Content) -> some View {
    content.modifier(modifier)
}

GPT:

I apologize for the confusion in my previous response. It seems I made another mistake in the code. Let's simplify the solution by removing the custom ViewModifier structs and the secureFieldStyle(_:) function. Instead, we can use a single if statement to conditionally display the TextField or a SecureField based on the isApiKeyVisible state.

Update the body property of the VStack as follows:

VStack {
    HStack {
        if isApiKeyVisible {
            TextField("Paste your API key here", text: $apiKey, onCommit: {
                keychain.set(apiKey, forKey: "openai_api_key")
            })
            .textFieldStyle(RoundedBorderTextFieldStyle())
            .disableAutocorrection(true)
            .autocapitalization(.none)
        } else {
            SecureField("Paste your API key here", text: $apiKey, onCommit: {
                keychain.set(apiKey, forKey: "openai_api_key")
            })
            .textFieldStyle(RoundedBorderTextFieldStyle())
            .disableAutocorrection(true)
            .autocapitalization(.none)
        }

        Button(action: {
            isApiKeyVisible.toggle()
        }) {
            Image(systemName: isApiKeyVisible ? "eye.slash.fill" : "eye.fill")
                .foregroundColor(.blue)
        }
    }
    .padding()

    // ... (rest of the code remains the same)
}

This code uses an if statement to display either a TextField or a SecureField based on the isApiKeyVisible state. The eye button toggles the visibility of the API key by changing the isApiKeyVisible state.


Bram:

still getting Type 'Any' cannot conform to 'ViewModifier' can you just rewrite ContentView from scratch


GPT:

I apologize for the confusion. Here's the updated ContentView.swift file without the custom ViewModifier structs and with a simplified solution for showing and hiding the API key:

import SwiftUI
import AVFoundation
import KeychainSwift

struct ContentView: View {
    @State private var isRecording = false
    @State private var audioRecorder: AVAudioRecorder!
    @State private var bulletPoints: String = ""
    @State private var apiKey: String = ""
    @State private var isApiKeyVisible = false

    private let keychain = KeychainSwift()

    var body: some View {
        VStack {
            HStack {
                if isApiKeyVisible {
                    TextField("Paste your API key here", text: $apiKey, onCommit: {
                        keychain.set(apiKey, forKey: "openai_api_key")
                    })
                    .textFieldStyle(RoundedBorderTextFieldStyle())
                    .disableAutocorrection(true)
                    .autocapitalization(.none)
                } else {
                    SecureField("Paste your API key here", text: $apiKey, onCommit: {
                        keychain.set(apiKey, forKey: "openai_api_key")
                    })
                    .textFieldStyle(RoundedBorderTextFieldStyle())
                    .disableAutocorrection(true)
                    .autocapitalization(.none)
                }

                Button(action: {
                    isApiKeyVisible.toggle()
                }) {
                    Image(systemName: isApiKeyVisible ? "eye.slash.fill" : "eye.fill")
                        .foregroundColor(.blue)
                }
            }
            .padding()

            Text("Bullet Points:")
                .font(.title)
            ScrollView {
                Text(bulletPoints)
            }
            .padding()

            Button(action: {
                if isRecording {
                    stopRecording()
                } else {
                    startRecording()
                }
                isRecording.toggle()
            }) {
                Text(isRecording ? "Stop Recording" : "Start Recording")
                    .padding()
                    .background(isRecording ? Color.red : Color.blue)
                    .foregroundColor(.white)
                    .clipShape(Capsule())
            }
            .padding()

            Button(action: {
                UIPasteboard.general.string = bulletPoints
            }) {
                Text("Copy to Clipboard")
                    .padding()
                    .background(Color.green)
                    .foregroundColor(.white)
                    .clipShape(Capsule())
            }
            .padding()
        }
    }

    // ... (rest of the code remains the same)
}

struct ContentView_Previews: PreviewProvider {
    static var previews: some View {
        ContentView()
    }
}

This updated ContentView uses an if statement to display either a TextField or a SecureField based on the isApiKeyVisible state. The eye button toggles the visibility of the API key by changing the isApiKeyVisible state.


Bram:

getting error:

Error transcribing audio: Error Domain= Code=-1 "(null)"


GPT:

The error you're encountering might be due to an incorrect response handling in the transcribeAudio(fileURL:completion:) function in the APIManager.swift file. The Whisper API response structure is different from what the function expects.

To fix this issue, update the transcribeAudio(fileURL:completion:) function in APIManager.swift as follows:

func transcribeAudio(fileURL: URL, completion: @escaping (Result<String, Error>) -> Void) {
    let headers: HTTPHeaders = [
        "Authorization": "Bearer \(apiKey)"
    ]

    let url = "\(openAIURL)/audio/transcriptions"

    AF.upload(multipartFormData: { multipartFormData in
        multipartFormData.append(fileURL, withName: "file")
        multipartFormData.append("whisper-1".data(using: .utf8)!, withName: "model")
    }, to: url, headers: headers).responseJSON { response in
        switch response.result {
        case .success(let value):
            if let json = value as? [String: Any], let data = json["data"] as? [String: Any], let transcription = data["transcription"] as? String {
                completion(.success(transcription))
            } else {
                completion(.failure(NSError(domain: "", code: -1, userInfo: nil)))
            }
        case .failure(let error):
            completion(.failure(error))
        }
    }
}

This updated function now correctly handles the response from the Whisper API by looking for the transcription key inside the data dictionary.

Please note that the Whisper API is in a research preview, and its performance may vary. If you still encounter issues, make sure your API key is correct and has the necessary permissions. Additionally, ensure that the audio file you're trying to transcribe is in a supported format and has clear speech.


Bram:

keychain not updating api key when i paste it in


GPT:

I apologize for the confusion. It seems the onCommit event is not being triggered as expected when pasting the API key. To fix this issue, let's create a separate button to save the API key to the keychain.

Update the body property of the VStack in ContentView.swift as follows:

VStack {
    HStack {
        if isApiKeyVisible {
            TextField("Paste your API key here", text: $apiKey)
                .textFieldStyle(RoundedBorderTextFieldStyle())
                .disableAutocorrection(true)
                .autocapitalization(.none)
        } else {
            SecureField("Paste your API key here", text: $apiKey)
                .textFieldStyle(RoundedBorderTextFieldStyle())
                .disableAutocorrection(true)
                .autocapitalization(.none)
        }

        Button(action: {
            isApiKeyVisible.toggle()
        }) {
            Image(systemName: isApiKeyVisible ? "eye.slash.fill" : "eye.fill")
                .foregroundColor(.blue)
        }
    }
    .padding()

    Button(action: {
        keychain.set(apiKey, forKey: "openai_api_key")
    }) {
        Text("Save API Key")
            .padding()
            .background(Color.orange)
            .foregroundColor(.white)
            .clipShape(Capsule())
    }
    .padding()

    // ... (rest of the code remains the same)
}

This code adds a "Save API Key" button below the API key input field. When clicked, it saves the API key to the keychain. This should resolve the issue with the keychain not updating the API key when pasting it in.


Bram:

if let json = value as? [String: Any], let data = json["data"] as? [String: Any], let transcription = data["text"] as? String {
                    completion(.success(transcription))
                }

is not correctly pulling text from

{
    text = "So now I'm recording. I guess the first time it breaks and I don't know why that is the case, but whatever.";

}

GPT:

I apologize for the confusion. It seems I made a mistake in the response handling code. Based on the response structure you provided, you should access the text key directly from the json dictionary.

Update the transcribeAudio(fileURL:completion:) function in APIManager.swift as follows:

func transcribeAudio(fileURL: URL, completion: @escaping (Result<String, Error>) -> Void) {
    let headers: HTTPHeaders = [
        "Authorization": "Bearer \(apiKey)"
    ]

    let url = "\(openAIURL)/audio/transcriptions"

    AF.upload(multipartFormData: { multipartFormData in
        multipartFormData.append(fileURL, withName: "file")
        multipartFormData.append("whisper-1".data(using: .utf8)!, withName: "model")
    }, to: url, headers: headers).responseJSON { response in
        switch response.result {
        case .success(let value):
            if let json = value as? [String: Any], let transcription = json["text"] as? String {
                completion(.success(transcription))
            } else {
                completion(.failure(NSError(domain: "", code: -1, userInfo: nil)))
            }
        case .failure(let error):
            completion(.failure(error))
        }
    }
}

This updated function now correctly handles the response from the Whisper API by looking for the text key directly inside the json dictionary.
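
Looking back, most of the response-parsing churn came from responseJSON plus manual dictionary casts. A typed model with Alamofire's responseDecodable would have sidestepped the key guessing; here's a hedged sketch (the WhisperResponse struct is my own, not from the chat):

import Foundation
import Alamofire

// The Whisper transcription endpoint returns {"text": "..."} by default,
// so a one-field Decodable model is enough.
struct WhisperResponse: Decodable {
    let text: String
}

func transcribeAudioTyped(fileURL: URL, apiKey: String,
                          completion: @escaping (Result<String, Error>) -> Void) {
    let headers: HTTPHeaders = ["Authorization": "Bearer \(apiKey)"]

    AF.upload(multipartFormData: { form in
        form.append(fileURL, withName: "file")
        form.append("whisper-1".data(using: .utf8)!, withName: "model")
    }, to: "https://api.openai.com/v1/audio/transcriptions", headers: headers)
    .responseDecodable(of: WhisperResponse.self) { response in
        switch response.result {
        case .success(let decoded):
            completion(.success(decoded.text))
        case .failure(let error):
            completion(.failure(error))
        }
    }
}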


bramadams.dev is a reader-supported published Zettelkasten. Both free and paid subscriptions are available. If you want to support my work, the best way is by taking out a paid subscription.
