Hi everybody, I find a huge part of my job is talking to colleagues and clients and at the end of those phone calls, I have to write a summary of what happened, plus any key points that I need to focus on followup.

I figured it would be an excellent task for a LLM.

It would need intercept the phone call dialogue, and transcribe the dialogue.

Then afterwards I would want to summarize it.

I’m not talking about teams meetings or anything like that, I’m talking a traditional phone call, via a mobile phone to another phone.

I understand that that could be two different pieces of software, and that would be fine, but I am wondering if there is any such tool out there, or a tool in the making?

If you have any leads, I’d love to hear them.

Thank you so much

  • Audalin@lemmy.world
    link
    fedilink
    English
    arrow-up
    13
    ·
    4 months ago

    Haven’t heard of all-in-one solutions, but once you have a recording, whisper.cpp can do the transcription:

    The underlying Whisper models are MIT.

    Then you can use any LLM inference engine, e.g. llama.cpp, and ask the model of your choice to summarise the transcript:

    You can also write a small bash/python script to make the process a bit more automatic.

    • makingStuffForFunOP
      link
      fedilink
      English
      arrow-up
      5
      ·
      4 months ago

      Okay, the idea is excellent, but I’ve just spent the last hour trying to get any app out there to record my calls.

      I’ve tried the open source one on f droid, and it almost works. I can get it to record my side, but that’s it.

      I tried commercial ones. I tried commercial ones with horrendous privacy policies. Nothing seems to work.

      I’ve used the accessibility options. I’ve gone deep down into the rabbit hole, so it looks like Android is fully cutting off the ability to record calls. In Australia at least.

      What a shame.

      These apps all seem to have the same ability of dropping the recording into a folder, so I could synchronize that across my network, have my server check for new files that appear into that folder, and then the LLM could convert that into a text file and send it straight back to me.

      Living the dream! But… Not

      • Audalin@lemmy.world
        link
        fedilink
        English
        arrow-up
        4
        ·
        edit-2
        4 months ago

        I expected that recording would be the hard part.

        I think some of the open-source ones should work if your phone is rooted?

        I’ve heard that Google’s phone app can record calls (though it says it aloud when starting the recording). Of course, it wouldn’t work if Google thinks it shouldn’t in your region.

        By the way, Bluetooth headphones can have both speakers and a microphone. And Android can’t tell a peripheral device what it should or shouldn’t do with audio streams. Sounds like a fun DIY project if you’re into it, or maybe somebody sells these already.

        • tomjuggler@lemmy.world
          link
          fedilink
          English
          arrow-up
          1
          ·
          2 months ago

          I was just thinking that, a raspberry pi working as a Bluetooth peripheral and some python code would work?

      • MalReynolds@slrpnk.net
        link
        fedilink
        English
        arrow-up
        3
        ·
        edit-2
        4 months ago

        Basically depends on if it’s legal to unilaterally record in your jurisdiction, if not the apps won’t be made available on pain of lawsuit. Nothing stopping you using speakerphone and recording with something else tho. Again, depending on jurisdiction, you may need to CYA with ‘your calls may be recorded for training purpose’. Note that unilaterally recording (i.e. without notifying the other party) is often a felony.

      • hendrik@palaver.p3x.de
        link
        fedilink
        English
        arrow-up
        1
        ·
        edit-2
        4 months ago

        I managed to get the open source call recorder working on a rooted LineageOS device. Seems it doesn’t work without root.

        Beware - recording people without their explicit consent is illegal in lots of countries.

        You could also use an office SIP phone and/or a telephony server (PBX) and a landline which might include recording capability. Or something like Google voice or a business phone provider which might offer email transcripts as part of their service. Or you get the audio via an (emulated) headset. Either cable or bluetooth. But that requires additional hardware. And check if your default dialer has some recording feature. I can enable that when opening the default dialer in the settings. And while in call a “record” button will appear next to “hold”. It’s not automatic though. You have to start a call and then manually record every one. And it might play a beep sound to the other party.

        Recording would be the first step. Without that working, we don’t need to discuss the next steps. But I agree, the next steps would be to chain something like Whisper or FasterWhisper and then an LLM.

        I suppose the next flagship android devices (Pixel?) come with similar features, as Google etc are pushing their AI services and features.

    • makingStuffForFunOP
      link
      fedilink
      English
      arrow-up
      4
      ·
      4 months ago

      That isn’t such a bad idea.

      I could probably semi automate a lot of this stuff as long as I can get something to record the phone call.

      I’ll go down that route and see what I can find. Thank you so much I do appreciate it

    • farsinuce@feddit.dk
      link
      fedilink
      English
      arrow-up
      1
      ·
      edit-2
      4 months ago

      It’d be perfect if you didn’t have to record the calls, but instead could use something like Faster-Whisper and saving live transcribed text files instead.