R/E/P Community

Topic: Can a mono conversation recording be split into two unique individual tracks?

WayneC

  • Newbie
  • *
  • Offline Offline
  • Posts: 1
  • Real Full Name: Wayne Campbell

My employer frequently uses my audio skills on projects, and they just gave me one that I have no idea how to solve.
I work for a call center company, and we receive a lot of customer calls that wind up with a lot of talk-over.
In other words, the agent and the customer are speaking at the same time.
They have asked me to find a way to split the conversations into two tracks, or otherwise show visually where two people are speaking at once. They want to identify call center agents who do this more frequently than others so they can be retrained.
Listening to thousands of conversations to find those agents would be, to say the least, extremely time consuming.
I've insisted that a wave file cannot show this, and neither can spectral analysis. If a method or package exists to do this, I'm not aware of it. I've even spent time looking at audio forensic packages, but they do not solve this.
Anyone out there with any ideas?
Thanks!

Tim Halligan

  • Jr. Member
  • **
  • Offline Offline
  • Posts: 84

The one remote possibility that springs to mind is looking at amplitude.

Areas where there is talk-over should be louder than when either end of the conversation talks alone.

If you could programme a gate or expander to open when the amplitude is greater than the average level, then perhaps you'd have a shot.

I imagine that it would have to analyse each conversation to determine what the average level is for that unique conversation, so it may be a 2-pass process.
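
A rough sketch of that two-pass idea, in Python, in case it helps someone try it. Everything named here is an assumption for illustration, not an existing package: the function names, the 160-sample frame (20 ms if the recording is 8 kHz), the 1.5x ratio and the silence floor would all need tuning against real calls. It also only flags candidates, since one person talking loudly trips it the same way talk-over does.

```python
import math

def frame_rms(samples, frame_len):
    """Split samples into non-overlapping frames and return each frame's RMS level."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return levels

def flag_loud_frames(samples, frame_len=160, ratio=1.5, silence_floor=0.01):
    """Return indices of frames whose level exceeds ratio * the call's average level.

    All default values are illustrative guesses, not calibrated settings.
    """
    levels = frame_rms(samples, frame_len)
    # Pass 1: average level of this one call, ignoring near-silent frames.
    active = [lv for lv in levels if lv > silence_floor]
    if not active:
        return []
    average = sum(active) / len(active)
    # Pass 2: the "gate opens" on frames louder than ratio * average.
    return [i for i, lv in enumerate(levels) if lv > ratio * average]
```

Samples are assumed to be floats scaled to the range -1.0 to 1.0. Counting the flagged frames per call gives a number you could rank calls by before anyone has to listen to them.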

Other than that, I got nothing.

HTH

Cheers,
Tim
An analogue brain in a digital world.

Fletcher

  • Administrator
  • Hero Member
  • *****
  • Offline Offline
  • Posts: 589

It seems to me that the easiest way to accomplish the goal of identifying the "call agents" who talk over the "customers" for "retraining" would be for the "call agent" to give their "employee identification number" at the beginning of every conversation.

While there may indeed be some software used in the world of "forensic audio" that can do this separation... being a "music" guy I'm entirely unaware of its existence. 

Best of luck with your search.

Peace
CN Fletcher

mwagener wrote on Sat, 11 September 2004 14:33
We are selling emotions, there are no emotions in a grid


"Recording engineers are an arrogant bunch
If you've spent most of your life with a few thousand dollars worth of musicians in the studio, making a decision every second and a half... and you and  they are going to have to live with it for the rest of your lives, you'll get pretty arrogant too.  It takes a certain amount of balls to do that... something around three"
Malcolm Chisholm

jaykadis

  • R/E/P Forums
  • Jr. Member
  • *
  • Offline Offline
  • Posts: 57

There's actually a lot of interest in this kind of task in the audio perception research world. Sound source separation attempts to duplicate the brain's ability to follow a single conversation in the midst of other talkers. So far, there isn't a commercial product that can do this as far as I know.


Do you not have independent access to the call agent's audio stream? That would be the easiest way to do this.

radardoug

  • Full Member
  • ***
  • Offline Offline
  • Posts: 100
  • Real Full Name: Doug Jane

Modify your recording system to record the two sides of the stream separately.
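
If the two sides can be captured separately, measuring talk-over becomes simple bookkeeping. Here is a minimal Python sketch, assuming the agent and customer audio arrive as two equal-rate lists of float samples; the function names, frame length and activity threshold are made up for illustration and would need tuning.

```python
import math

def active_frames(samples, frame_len, threshold):
    """Return one True/False per frame: True when the frame's RMS exceeds threshold."""
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        flags.append(rms > threshold)
    return flags

def overlap_ratio(agent, customer, frame_len=160, threshold=0.02):
    """Fraction of speech frames in which both sides are active at once.

    frame_len and threshold are illustrative guesses, not calibrated values.
    """
    a = active_frames(agent, frame_len, threshold)
    c = active_frames(customer, frame_len, threshold)
    n = min(len(a), len(c))  # compare only the span both channels cover
    both = sum(1 for i in range(n) if a[i] and c[i])
    either = sum(1 for i in range(n) if a[i] or c[i])
    return both / either if either else 0.0
```

Averaging this ratio over each agent's calls would give a per-agent figure to compare. A plain level threshold will also count line noise or echo as speech, so treat the result as a screening number only.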

Fletcher

  • Administrator
  • Hero Member
  • *****
  • Offline Offline
  • Posts: 589

Separation of these signals is not possible with current telephone technology, which mixes both sides of the conversation into one mono event -- even if you could split the signal on one end, you couldn't on the other, as it's not the way telephones work.

Peace

Augustine Leudar

  • Newbie
  • *
  • Offline Offline
  • Posts: 9

If you want to do it automatically, you'd have to develop your own software. You could use speech recognition to build a voice profile for each agent, then combine that with the amplitude reading to work out when the agent is talking over someone else. It would take a long time to program!