Tag archive: voice recognition

Inside the Newest Kinect for Windows SDK – Infrared Control

The Kinect for Windows software development kit (SDK) October release was a pivotal update with a number of key improvements. One of the most important is enhanced control of the sensor's infrared (IR) capabilities, which opens up a world of new possibilities for developers.

IR sensing is a core feature of the Kinect sensor, but until this newest release, developers were somewhat constrained in how they could use it. The front of the Kinect for Windows sensor has three openings, each housing a core piece of technology. On the left is an IR emitter, which projects a factory-calibrated pattern of dots across the room in which the sensor resides. The middle opening is a color camera. The third houses the IR camera, which reads the dot pattern, enabling the Kinect for Windows system software to sense objects and people and to compute their skeletal tracking data.
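
To give a feel for what this unlocks, here is a minimal sketch of reading the raw IR stream with the managed API from this release; it assumes a single connected sensor and omits error handling.

```csharp
using System;
using System.Linq;
using Microsoft.Kinect;

class InfraredSample
{
    static void Main()
    {
        // Grab the first connected sensor (assumes exactly one is plugged in).
        KinectSensor sensor = KinectSensor.KinectSensors
            .FirstOrDefault(s => s.Status == KinectStatus.Connected);
        if (sensor == null) return;

        // In this release, infrared is surfaced through the color stream
        // as a 16-bit grayscale format.
        sensor.ColorStream.Enable(ColorImageFormat.InfraredResolution640x480Fps30);

        sensor.ColorFrameReady += (s, e) =>
        {
            using (ColorImageFrame frame = e.OpenColorImageFrame())
            {
                if (frame == null) return;            // frame was dropped
                var pixels = new byte[frame.PixelDataLength];
                frame.CopyPixelDataTo(pixels);        // 2 bytes per IR pixel
                // 'pixels' now holds raw IR intensities, ready to render
                // or analyze; IR works even in a pitch-dark room.
            }
        };

        sensor.Start();
        Console.ReadLine();                           // keep receiving frames
        sensor.Stop();
    }
}
```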

Read more

Inside the Kinect for Windows SDK Update with Peter Zatloukal and Bob Heddle

Now that the updated Kinect for Windows SDK is available for download, Engineering Manager Peter Zatloukal and Group Program Manager Bob Heddle sat down to discuss what this significant update means to developers.

Bob Heddle demonstrates the new infrared functionality in the Kinect for Windows SDK.

Why should developers care about this update to the Kinect for Windows Software Development Kit (SDK)?

Bob: Because they can do more stuff and then deploy that stuff on multiple operating systems!

Peter: In general, developers will like the Kinect for Windows SDK because it gives them what I believe is the best tool out there for building applications with gesture and voice.

In the SDK update, you can do more things than you could before, and there's more documentation, including a specific sample called Basic Interactions that's a follow-on to our Human Interface Guidelines (HIG). The Human Interface Guidelines are a big investment of ours, and will continue to be. First we gave businesses and developers the HIG in May, and now we have this first sample demonstrating an implementation of the HIG. With it, the Physical Interaction Zone (PhIZ) is exposed. The PhIZ is a component that maps a motion range to the screen size, allowing users to comfortably control the cursor on the screen.
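
To make the idea concrete, here is a hedged sketch of the kind of mapping the PhIZ performs, written against the SDK's managed skeleton types; the comfort-box dimensions are illustrative guesses, not the values the shipped component uses.

```csharp
using System;
using Microsoft.Kinect;

static class PhizSketch
{
    // Illustrative comfort-box size in meters; the shipped PhIZ uses its
    // own tuned values, which are not reproduced here.
    const float BoxWidth = 0.40f, BoxHeight = 0.30f;

    // Map a tracked hand joint to pixel coordinates on the screen.
    public static void HandToScreen(
        Joint hand, Joint shoulder, int screenWidth, int screenHeight,
        out double x, out double y)
    {
        // Hand position relative to a box centered in front of the shoulder.
        float dx = hand.Position.X - shoulder.Position.X + BoxWidth / 2f;
        float dy = shoulder.Position.Y - hand.Position.Y + BoxHeight / 2f;

        // Normalize into [0,1] and clamp so the cursor stays on screen.
        double nx = Math.Max(0.0, Math.Min(1.0, dx / BoxWidth));
        double ny = Math.Max(0.0, Math.Min(1.0, dy / BoxHeight));

        x = nx * screenWidth;
        y = ny * screenHeight;
    }
}
```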

This sample is a bit hidden in the toolkit browser, but everyone should check it out. It embodies the best practices we described in the HIG and can be repurposed by developers quickly and easily.

Bob: First we had the HIG, now we have this first sample. And it’s only going to get better. There will be more to come in the future.

Why upgrade?

Bob: There's no downside to upgrading, so everyone should do it today! There are no breaking changes; it's fully compatible with previous releases of the SDK, it extends your operating system reach, there are a lot of new features, and it supports distribution in more countries with localized setup and license agreements. And, of course, China is now part of the equation.

Peter: There are four basic reasons to use the Kinect for Windows SDK and to upgrade to the most recent version:

  • More sensor data are exposed in this release.
  • It’s easier to use than ever (more samples, more documentation).
  • There’s more operating system and tool support (including Windows 8, virtual machine support, Microsoft Visual Studio 2012, and Microsoft .NET Framework 4.5).
  • It supports distribution in more geographical locations. 

What are your top three favorite features in the latest release of the SDK and why?

Peter: If I must limit myself to three, then I’d say the HIG sample (Basic Interactions) is probably my favorite new thing. Secondly, there’s so much more documentation for developers. And last but not least…infrared! I’ve been dying for infrared since the beginning. What do you expect? I’m a developer. Now I can see in the dark!

Bob: My three would be extended-range depth data, color camera settings, and Windows 8 support. Why wouldn’t you want to have the ability to develop for Windows 8? And by giving access to the depth data, we’re giving developers the ability to see beyond 4 meters. Sure, the data out at that range isn’t always pretty, but we’ve taken the guardrails off—we’re letting you go off-roading. Go for it!

New extended-range depth data now provides details beyond 4 meters. These images show the difference between depth data gathered from previous SDKs (left) versus the updated SDK (right).
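
For those who want to try it, here is a hedged sketch of reading the extended range using the DepthImagePixel structure introduced in this release; it assumes a handler wired to a started sensor's depth stream.

```csharp
using System;
using Microsoft.Kinect;

// Handler for KinectSensor.DepthFrameReady; assumes the depth stream was
// enabled with DepthImageFormat.Resolution640x480Fps30 on a started sensor.
void OnDepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    using (DepthImageFrame frame = e.OpenDepthImageFrame())
    {
        if (frame == null) return;

        // DepthImagePixel (new in this release) exposes the full depth
        // reading in millimeters rather than a clamped value.
        var pixels = new DepthImagePixel[frame.PixelDataLength];
        frame.CopyDepthImagePixelDataTo(pixels);

        int beyondFourMeters = 0;
        foreach (DepthImagePixel p in pixels)
        {
            // Readings past 4000 mm are the extended-range data:
            // noisier, but no longer cut off at the old maximum.
            if (p.IsKnownDepth && p.Depth > 4000)
                beyondFourMeters++;
        }
        Console.WriteLine("{0} pixels beyond 4 m", beyondFourMeters);
    }
}
```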

Peter: Oh yeah, and regarding camera settings, in case it isn’t obvious: this is for those people who want to tune their apps specifically to known environments.
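
Concretely, a kiosk application running under fixed lighting might lock the color camera down along these lines. The sketch assumes the new camera-settings object exposed on the color stream in this release; the numeric values are placeholders to be tuned on site, not recommendations.

```csharp
using Microsoft.Kinect;

// 'sensor' is assumed to be a started KinectSensor with its color
// stream enabled; the settings object controls the color camera.
ColorCameraSettings camera = sensor.ColorStream.CameraSettings;

// Turn off the automatic algorithms so the image stays stable under
// the known, fixed lighting of a kiosk or lab rig.
camera.AutoExposure = false;
camera.AutoWhiteBalance = false;

// Placeholder values only: tune these to the lighting on site.
camera.ExposureTime = 400;   // exposure, in the SDK's documented units
camera.Gain = 2.0;           // analog gain
camera.WhiteBalance = 4500;  // color temperature in kelvins

// camera.ResetToDefault() restores the automatic factory behavior.
```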

What’s it like working together?

Peter: Bob is one of the most technically capable program managers (PMs) I have had the privilege of working with.

Bob: We have worked together for so long—over a decade and in three different companies—so there is a natural trust in each other and our abilities. When you are lucky to have that, you don’t have to spend energy and time figuring out how to work together. Instead, you can focus on getting things done. This leaves us more time to really think about the customer rather than the division of labor.

Peter: My team is organized by areas of technical affinity. I have developers focused on:

  • SDK runtime
  • Computer vision/machine learning
  • Drivers and low-level subsystems
  • Audio
  • Samples and tools

Bob: We have a unique approach to the way we organize our teams: I take a very scenario-driven approach, while Peter takes a technically focused approach. My team is organized into PMs who look holistically across what end users need, versus what commercial customers need, versus what developers need.

Peter: We organize this way intentionally and we believe it’s a best practice that allows us to iterate quickly and successfully!

What was the process you and your teams went through to determine what this SDK release would include, and who is this SDK for?

Bob: This SDK is for every Kinect for Windows developer and anyone who wants to develop with voice and gesture. Seriously, if you're already using a previous version, there is really no reason not to upgrade. You might have noticed that we gave developers a first version of the SDK in February, then a significant update in May, and now this release. We have designed Kinect for Windows around rapid updates to the SDK; as we roll out new functionality, we test backward compatibility very thoroughly and ensure there are no breaking changes.

We are wholeheartedly dedicated to Kinect for Windows. And we’re invested in continuing to release updated iterations of the SDK rapidly for our business and developer customers. I hope the community recognizes that we’re making the SDK easier and easier to use over time and are really listening to their feedback.

Peter Zatloukal, Engineering Manager
Bob Heddle, Group Program Manager
Kinect for Windows

Nissan Pathfinder Virtual Showroom is Latest Auto Industry Tool Powered by Kinect for Windows

Automotive companies Audi, Ford, and Nissan are adopting Kinect for Windows as the newest way to put a potential driver into a vehicle. Most car buyers want to get "hands on" with a car before they are ready to buy, so automobile manufacturers have invested in tools such as online car configurators and 360-degree image viewers that make it easier for customers to visualize the vehicle they want.

Now, Kinect's unique combination of camera, body tracking capability, and audio input can put the car buyer into the driver's seat in more immersive ways than have been previously possible—even before the vehicle is available on the retail lot!

The most recent example of this automotive trend is the 2013 Nissan Pathfinder application powered by Kinect for Windows, which was originally developed to demonstrate the new Pathfinder at auto shows before there was a physical car available.

Nissan quickly recognized the value of this application for building buzz at local dealerships, piloting it in 16 dealerships in 13 states nationwide.

"The Pathfinder application using Kinect for Windows is a game changer in terms of the way we can engage with consumers," said John Brancheau, vice president of marketing at Nissan North America. "We're taking our marketing to the next level, creating experiences that enhance the act of discovery and generate excitement about new models before they're even available. It's a powerful pre-sales tool that has the potential to revolutionize the dealer experience."

Digital marketing agency Critical Mass teamed with interactive experience developer IdentityMine to design and build the Kinect-enabled Pathfinder application for Nissan. "We're pioneering experiences like this one for two reasons: the ability to respond to natural human gestures and voice input creates a rich experience that has broad consumer appeal," notes Critical Mass President Chris Gokiert. "Additionally, the commercial relevance of an application like this can fulfill a critical role in fueling leads and actually helping to drive sales on site."

Each dealer has a kiosk that includes a Kinect for Windows sensor, a monitor, and a computer that's running the Pathfinder application built with the Kinect for Windows SDK. Since the Nissan Pathfinder application first debuted at the Chicago Auto Show in February 2012, developers have made several enhancements, including a new pop-up tutorial and interface improvements, such as larger interaction icons and instructional text along the bottom of the screen, so that a customer with no Kinect experience can jump right in. "In the original design for the auto show, the application was controlled by a trained spokesperson. That meant aspects like discoverability and ease of use for first-time users were things we didn't need to design for," noted IdentityMine Research Director Evan Lang.

Now, shoppers who approach the Kinect-based showroom are guided through an array of natural movements—such as extending their hands, stepping forward and back, and leaning from side to side—to activate hotspots on the Pathfinder model, allowing them to inspect the car inside and out.
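
As a rough illustration of how such a movement can be detected, the hedged sketch below classifies a sideways lean by comparing the shoulder-center and hip-center joints from SDK skeleton tracking; it is not IdentityMine's actual gesture code, and the threshold is an illustrative guess.

```csharp
using Microsoft.Kinect;

static class LeanSketch
{
    // Illustrative threshold in meters; a production app would tune this.
    const float LeanThreshold = 0.12f;

    // Returns -1 for a lean to the left, +1 to the right, 0 for upright.
    public static int DetectLean(Skeleton skeleton)
    {
        Joint shoulders = skeleton.Joints[JointType.ShoulderCenter];
        Joint hips = skeleton.Joints[JointType.HipCenter];

        if (shoulders.TrackingState != JointTrackingState.Tracked ||
            hips.TrackingState != JointTrackingState.Tracked)
            return 0;

        // A sideways lean shows up as a horizontal offset between the
        // shoulder line and the hips.
        float dx = shoulders.Position.X - hips.Position.X;
        if (dx > LeanThreshold) return 1;
        if (dx < -LeanThreshold) return -1;
        return 0;
    }
}
```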

The project was not, however, without a few challenges. The detailed Computer-Aided Design (CAD) model data provided by Nissan, while ideal for commercials and other post-rendered uses, did not lend itself easily to a real-time engine. "A lot of rework was necessary that involved 'retopologizing' the mesh," reported IdentityMine's 3D Design Lead Howard Schargel. "We used the original as a template and traced over it to get a cleaner, more manageable polygon count. We were able to remove well over half of the original polygons, allowing for more fluid interactions and animations while still retaining the fidelity of the client's original model."

And then, the development team pushed further. "The application uses a dedicated texture to provide a dynamic, scalable level of detail to the mesh by adding or removing polygons, depending on how close it is to the camera," explained Schargel. "It may sound like mumbo jumbo, but when you see it, you won't believe it."

You can see the Nissan Pathfinder app in action at one of the 16 participating dealerships or by watching our video case study.

Kinect for Windows Team

Kinect for Windows at Imagine Cup 2012 Finals

The Imagine Cup competition, which recently concluded its tenth year, throws the spotlight on cutting-edge innovations. Two-thirds of the education-focused projects utilized Microsoft Kinect in a variety of ways, including interactive therapy for stroke victims, an automated cart that makes solo trips to crowded public places manageable for the disabled, and an application that helps dyslexic children learn the alphabet.

Team wi-GO of Portugal invented a Kinect-enabled cart to aid the disabled.

Students from 75 countries participated in the Imagine Cup Finals, held July 6 to 11 in Sydney, Australia, where more than 100 projects were featured. Kinect for Windows played a significant role in this year's competition, with 28 Kinect-enabled projects across multiple categories, including Software Design, Game Design, Windows Azure, and a Fun Labs Challenge that was focused entirely on Kinect.

With the goal of using technology to help solve the world’s toughest problems, students put Kinect to work providing the digital eyes, ears, and tracking capabilities needed for a range of potential new products and applications. We applaud all of the teams who incorporated Kinect for Windows into their projects this year! Here are highlights from a few of them:

  • Third-place Software Design Category: Team wi-GO (Portugal) designed a cart to free the hands of a person in a wheelchair. It tracks the person seated in the chair while avoiding obstacles (including other people) when navigating through crowded stores, malls, airports, hospitals, and more. The solution may even have industrial applications, serving as a tool to transport objects without the need for human assistance.
    Tools: Kinect for Windows, Windows 8, and Netduino open-source electronics platform with .NET Micro Framework
  • Second-place Kinect Fun Labs Challenge: Team Whiteboard Pirates (United States) developed Duck Duck Punch, a "game" that provides therapy to people who have experienced strokes and need help improving their arm range of motion. The game has the patient stretch to hit digital birds within prescribed limits; physical therapists can tailor the experience to each individual's needs.
    Tools: Kinect for Windows and Kinect Gadget Accelerator Kit
  • Third-place Kinect Fun Labs Challenge: Team Flexify (Poland) made Reh the Dragon, a rehabilitation application that transforms tedious rehabilitation exercises for children into a fun and engaging game-like adventure.
    Tools: Kinect for Windows and XNA Game Studio
  • Health Awareness Award: Italian Ingenium Team (Italy) developed The Fifth Element Project, which uses Kinect voice recognition and motion detection to help autistic children learn through play and movement.
    Tools: Kinect for Xbox 360, Windows Azure, Windows 7, and Windows 8
  • People’s Choice Award: The D Labs (India) built a tool for children who have dyslexia that aids in alphabet identification and other skills while tracking behavioral patterns.
    Tools: Kinect for Xbox 360, Microsoft Silverlight, Windows Azure, XNA Game Studio, and Windows 8
  • Finalist: Make a Sign (Belgium) created a sign language database, complete with Kinect motion tracking that confirms when a gesture is performed correctly.
    Tools: Kinect for Xbox 360, Windows Phone, and Windows Azure

"Imagine Cup is about giving students the resources and tools they need to succeed and then getting out of their way and letting them create," said Walid Abu-Hadba, corporate vice president of Microsoft's Developer and Platform Evangelism group. "Kinect in particular is unlocking a new class of interactive solutions. It's inspiring to watch the way students from a multitude of backgrounds find common ground as they combine their love of technology with their determination to make a difference. It's amazing."

We look forward to next year’s Imagine Cup. In the meantime, keep up the great work.

Kinect for Windows Team

Key Links

• Kinect for Windows Gallery
• Imagine Cup website
• Imagine Cup winners and finalists
• Team wi-GO
• Team Whiteboard Pirates
• Team Flexify
• Italian Ingenium Team
• The D Labs
• Make a Sign