Xbox One Kinect Can Understand Two Voices Speaking at Once

Players don't need to be polite and take turns when talking to the Xbox One.

The Xbox One version of Kinect can detect and differentiate between multiple voices speaking at the same time, according to Microsoft VP Phil Harrison. Speaking at the Eurogamer Expo in London, Harrison explained that the Xbox One's new and improved motion control system can understand two distinct sets of voices, even when they're speaking at the same time. Microsoft new technology lead developer Nick Burton added that the Kinect will also detect whether players' mouths are moving, even in a dark room.

Microsoft has made it very clear that the Xbox One version of Kinect, which comes packaged with the upcoming console, will be a massive improvement over the first version of the peripheral. The Xbox One Kinect can detect 25 joints spread across as many as six people, calculate player heart rates and detect as many as 1,400 points of articulation on their faces. The Kinect can also distinguish the people in the room who are actively using a controller from spectators.

Even though the Kinect is technically not a required component of the Xbox One experience, third-party developers are looking into ways to use the peripheral's new features, from Harmonix's use of musical gestures in Fantasia to head-tracking in Battlefield 4.

Source: Polygon

But can it understand this?

(and yes Kinect will be fully functional at launch for people in Scotland)

Michael Epstein:
The Xbox One Kinect can detect 25 joints spread across as many as six people..

image

Woah, man.

OT: The more of these things I hear about Kinect the more I remain unimpressed, but get increasingly scared instead. What does it do if those two simultaneous voices are each commanding the Xbone to do something? Please say "it explodes".

Ed130:
But can it understand this?

clip

(and yes Kinect will be fully functional at launch for people in Scotland)

I DID! I think my ears are done for the day, but it is in fact possible. Now the question is, can it understand that with bagpipes in the background (because all Scottish people have them in their houses, duh)?

And it can lip-read, even if it cannot hear what you said, ... Dave

TiberiusEsuriens:
I DID! I think my ears are done for the day, but it is in fact possible. Now the question is, can it understand that with bagpipes in the background (because all Scottish people have them in their houses, duh)?

I work on a busy central street in Glasgow and the frequency with which we have to tolerate bagpipers in earshot is maddening. People phone us, and yes, bloody pipes can be heard in the background. They're not helping the stereotype!

Wow, the amount of information this thing can monitor is incredible. I don't want one anywhere near me.

Yeah that's cool, but if two people give it conflicting orders what will it do?

Whelp, if someone says "Xbox off", there's not a dang thing you can do about it or drown them out with. :P

While I do not appreciate the price bloat that comes from the new Kinect being shoved down our throats, I do have to admit that I am anticipating its various improvements over the original. It is certainly something I can see myself using more if it ends up being as functional as they say. That being said, the new technology doesn't mean squat if developers won't take advantage of it, which I think is a bit of a shame.

Okay, how the hell did the OP get banned? Was he a MS rep or something? O_o Nevermind, Site Hiccup on my end ^^'

Still, $100 for THAT!? Sheesh, priorities much?

So what happens if two people give it mutually exclusive orders at the same time? Like, one says "watch TV" and the other says "play Halo", or something like that?

DataSnake:
So what happens if two people give it mutually exclusive orders at the same time? Like, one says "watch TV" and the other says "play Halo", or something like that?

Probably one of the number of things that current hardware does when given conflicting instructions (it often varies by program, even on the same OS): it will likely perform either the first or the last instruction until it completes, or possibly run the instructions sequentially. It's a computer, so it will somehow register one before the other to decide this; the chances of two instructions finishing too close together for it to decide which was first are fairly small. I'm skeptical about how well this will work, but that part is relatively simple. Hell, I'd not be surprised if they didn't really need to specifically code for it and just let it happen, as it's fairly likely one of the possibilities above will happen anyway.
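For what it's worth, the three outcomes described above (run the first command, run the last, or run them in order) can be sketched in a few lines. This is only an illustration: the `VoiceCommand` type, the `arbitrate` function, the policy names and the timestamps are all made up for the example, and nothing here reflects how the Xbox One actually handles conflicting commands.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceCommand:
    """Hypothetical record of one recognized phrase."""
    speaker: str        # label for whoever spoke (assumed to be known)
    text: str           # the recognized phrase
    finished_at: float  # seconds; when the recognizer finished the phrase


def arbitrate(commands, policy="first"):
    """Return the commands that would run, in order, under a simple policy.

    "first"      -> only the command that finished earliest
    "last"       -> only the command that finished latest
    "sequential" -> every command, ordered by finish time
    """
    # sorted() is stable, so exact ties keep their arrival order.
    ordered = sorted(commands, key=lambda c: c.finished_at)
    if not ordered:
        return []
    if policy == "first":
        return [ordered[0]]
    if policy == "last":
        return [ordered[-1]]
    if policy == "sequential":
        return ordered
    raise ValueError(f"unknown policy: {policy!r}")
```

With "play Halo" finishing at 2.31 s and "watch TV" at 2.30 s, the "first" policy picks "watch TV", "last" picks "play Halo", and "sequential" runs both in that order. The timestamps do all the work, which is the point being made above: as long as one command registers even slightly before the other, no special conflict handling is needed.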

Spaceman Spiff:
Wow, the amount of information this thing can monitor is incredible. I don't want one anywhere near me.

For the purposes of monitoring you, it makes far more sense to hide a microphone in it and process the audio at their HQ than to use a dirty great obvious thing like Kinect 2.0 that, to boot, can be unplugged.....

A mobile phone with inbuilt GPS and the like is a far more sensible starting point to monitor people with than a console.

I hope this happens

If only Xbox had those robot voices

Michael Epstein:
The Xbox One version of Kinect can detect and differentiate between multiple voices speaking at the same time

No it won't. I'm telling you categorically right now that if two people are speaking at exactly the same time and are standing right next to each other, it won't have a clue what's being said. Understanding what a human voice is saying is really, really hard, and I know because I've dabbled in writing systems that do just that during my (long) career. Understanding a single voice is difficult enough (which is why there will only be a few countries supported by the damn thing), but as soon as you add any layer of mush over the top (be it a drill, a car, a TV in the background, or another human voice), the audio signal you get is a garbled mess. Sure, you can filter out what isn't likely to be a human voice - extremely high and low frequencies, for example - but two human voices at the same time occupy a similar range of pitch, so separating the two by using a generic catch-all algorithm in real-time is, basically, impossible.

However, bearing in mind that the Kinect has a 4 mic array, I'm guessing that it will rely on the two people speaking to be a good distance apart, and can then bias a particular person's voice using a particular input based on their location. So if I'm standing to the extreme left, and someone else is standing at the extreme right, then our voices would attempt to be decoded by isolating the mic input nearest to us. This is the only way I can think of that has any faint hope of working. And even then, it'll get it wrong for much of the time, I suspect.

If the two speakers are standing either right next to each other, or in a line, it won't have any idea what either person is saying.
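To make the geometry above concrete, here is a rough sketch of the "far apart works, side by side or in a line doesn't" argument. It swaps the "nearest mic" idea for a simpler stand-in: compare the direction each speaker is in relative to the sensor, and call them separable only if those directions differ by some minimum angle. The 20 degree threshold, the coordinate convention and the function names are assumptions for illustration, not Kinect specifications.

```python
import math

# Assumed minimum angular gap for the array to tell two speakers apart.
# This number is invented for the example; it is not a Kinect spec.
MIN_SEPARATION_DEG = 20.0


def bearing_deg(x, z):
    """Direction of a speaker as seen from the sensor, in degrees.

    x is the left/right offset and z is the distance in front of the
    sensor, both in metres. 0 degrees means straight ahead.
    """
    return math.degrees(math.atan2(x, z))


def can_separate(speaker_a, speaker_b, min_sep_deg=MIN_SEPARATION_DEG):
    """True if the two (x, z) positions differ enough in direction."""
    gap = abs(bearing_deg(*speaker_a) - bearing_deg(*speaker_b))
    return gap >= min_sep_deg
```

Under these assumptions, two people at the far left and far right of the room, say `(-1.5, 2.0)` and `(1.5, 2.0)`, come out as separable. Two people standing shoulder to shoulder, or one directly behind the other along the same line from the sensor, do not, because direction is the only thing this model can tell apart.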

And all of that would be so cool if it wasn't attached to a game console, in your living room, always watching.

EDIT:

Here's a random thought: will all this Xbox tech be used to get through those Captcha systems that offer their text to be spoken?

lacktheknack:
Whelp, if someone says "Xbox off", there's not a dang thing you can do about it or drown them out with. :P

Kinect should provide a disposable knife or other sharp object for situations like these so you can make sure it won't happen again.

DiamanteGeeza:

However, bearing in mind that the Kinect has a 4 mic array, I'm guessing that it will rely on the two people speaking to be a good distance apart, and can then bias a particular person's voice using a particular input based on their location. So if I'm standing to the extreme left, and someone else is standing at the extreme right, then our voices would attempt to be decoded by isolating the mic input nearest to us. This is the only way I can think of that has any faint hope of working. And even then, it'll get it wrong for much of the time, I suspect.

If the two speakers are standing either right next to each other, or in a line, it won't have any idea what either person is saying.

That's really the ONLY way you can ever do that anyway. If you use the same mic for both voices, the voice waves overlap anyway, and the software would have to be specifically told that both people are speaking at the same time and have samples of their voices to compare against to even begin to try to understand anything.
Though the more likely situation is actually that it's more PR talk and it won't work. You know, pretty much like everything with the old Kinect.

Strazdas:
That's really the ONLY way you can ever do that anyway. If you use the same mic for both voices, the voice waves overlap anyway, and the software would have to be specifically told that both people are speaking at the same time and have samples of their voices to compare against to even begin to try to understand anything.

Nope, you can't do it. It doesn't matter if you have prior, clear samples of the voices of the people talking: unless one person has a really high-pitched voice and the other has a very, very deep voice, it's simply not possible to separate the two if it's recorded using a single mic.

DiamanteGeeza:

Strazdas:
That's really the ONLY way you can ever do that anyway. If you use the same mic for both voices, the voice waves overlap anyway, and the software would have to be specifically told that both people are speaking at the same time and have samples of their voices to compare against to even begin to try to understand anything.

Nope, you can't do it. It doesn't matter if you have prior, clear samples of the voices of the people talking: unless one person has a really high-pitched voice and the other has a very, very deep voice, it's simply not possible to separate the two if it's recorded using a single mic.

I meant more that it would know the exact pitch level and would know what to look for, so yes, it is the high versus deep case. Though that wouldn't be a problem for me personally, as I've got a really deep one, so deep in fact that quite a few internet voice chat programs filter me out as background noise (it's always fun when that happens). But yep, two voices in the same mic is quite impossible in the average household.

 
