Timing is one of the things I've found most difficult to pick up while learning basic gymnastics. This is especially true when it comes to high bar/strap bar giants and tap swings. If I just blindly practice, I tend to tap far too early. For back giants, the tap is the moment when you transition from a hollow to an arch then back to a hollow again. To facilitate switching from hollow to arch at the right time I built the device you see in the video below. I suspect it would also work well for kip timing (it would tell you when to bring toes to the bar). It's basically a beam break sensor, like you might have at the bottom of your garage door, except it runs on batteries and beeps whenever something crosses between the flashlight and sensor. The idea is that you place the flashlight and sensor on either side of the high-bar at the location where you should ideally tap and then as you swing you just wait to hear the beep before switching from hollow to arch.
Here you can see and hear the tap trainer in action. I know my form is terrible; the video is just to show you how the device works. That high-pitched beep you hear every time I approach maybe 30 degrees away from vertical is the sensor detecting me crossing its path and then beeping.
Keyboardmods
Input device reviews and custom keyboards.
Wednesday, January 11, 2012
Monday, October 17, 2011
Kinect speech recognition in linux
Audio support is now part of libfreenect. Additionally, it is now possible to load the Microsoft SDK version of the audio firmware from linux courtesy of a utility called kinect_upload_fw written by Antonio Ospite. This version of the firmware makes the kinect appear to your computer as a standard USB microphone.
This means you can now record audio using your kinect, but that's not all that interesting in and of itself. Linux support for speech recognition at this point is not all that great. It is possible to run Dragon NaturallySpeaking via wine or to use the sphinx project (after much training), but neither of those approaches really appealed to me for simple voice commands (as opposed to dictation). The google android project happens to include a speech recognizer from Nuance which by default is meant to be built for an ARM target, like your phone. After extensive hacking around the build system I was able to instead build it for an x86 target, like your desktop. Now you can combine these two things, the kinect array microphone and android voice recognition, to do some more interesting things, e.g. toggle hand tracking on and off via voice.
How to get started:
1) Check if you have the "unbuffer" application, which is part of the linux scripting language called expect:
which unbuffer
If the above command comes up empty you should download a copy of unbuffer from the link here:
http://dl.dropbox.com/u/11217419/unbuffer
Copy unbuffer to a directory that is in your path, like /usr/local/bin or ~/bin
2) Download my precompiled version of the srec subproject from here:
http://dl.dropbox.com/u/11217419/srec_kinect.tgz
3) Save the tarball from step 2 in a convenient directory then unpack it with this command:
tar xfz srec_kinect.tgz
4) Switch into the subdirectory where I've placed some convenience scripts:
cd srec/config/en.us
5) Open a second terminal and in that second terminal also switch into srec/config/en.us
6) In the first terminal execute
./run_SRecTestAudio.sh
and in the other terminal execute
cat speech_fifo
7) Try speaking into your microphone and wait for recognition results to appear in both terminals. Note that the vocabulary as configured at this point is very small; words like up, down, left, right and the numbers from 1-9 should be recognized properly.
Integrating the kinect:
1) Acquire Antonio Ospite's firmware tools like so:
git clone http://git.ao2.it/kinect-audio-setup.git/
2) Move into the kinect-audio-setup subdirectory:
cd kinect-audio-setup
3) Build kinect_upload_fw as root:
make install
4) Fetch and extract the Microsoft kinect SDK audio firmware (depending on your directory permissions, this may also need to be run as root):
./kinect_fetch_fw /lib/firmware/kinect
This will extract the firmware to this location by default:
/lib/firmware/kinect/UACFirmware.C9C6E852_35A3_41DC_A57D_BDDEB43DFD04
5) Upload the newly extracted firmware to the kinect:
kinect_upload_fw /lib/firmware/kinect/UACFirmware.C9C6E852_35A3_41DC_A57D_BDDEB43DFD04
6) Check for a new USB audio device in your dmesg output
7) Configure the kinect USB audio device to be your primary microphone input and try out run_SRecTestAudio.sh again as described earlier.
Additional Notes:
I unfortunately no longer remember all the changes I had to make to get the srec project within android to build for x86. Perhaps someone with better knowledge of the android build system can chime in in the comments below. In the interim, use the precompiled copy that I have linked above; just be aware that it is old. I think it dates back to the froyo branch of android or earlier (I compiled it a long time ago). If you want to take a shot at building the latest srec yourself, check out the android source code then look under external/srec/
The run_SRecTestAudio.sh script sets up the speech recognizer to run on live audio and pipes the recognition results to a fifo in the same directory called speech_fifo. Running cat in the second terminal lets you read out the recognition results as they arrive. Instead of cat you could alternatively have whatever program needs recognition results read from the fifo and act accordingly. Unbuffer is used to make sure you see recognition results right away rather than waiting for the output buffer feeding speech_fifo to fill up.
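As a rough illustration of "have a program read from the fifo and act accordingly", here is a minimal C++ sketch. It assumes each recognition result arrives as one line of text with the recognized words separated by whitespace; I have not documented the exact output format of the recognizer here, so treat that as a guess. The names dispatch_commands, run_on_fifo and CommandTable are made up for this example.

```cpp
#include <cassert>
#include <fstream>
#include <functional>
#include <iostream>
#include <map>
#include <sstream>
#include <string>

// Maps a recognized word to the action to run for it (hypothetical helper type).
using CommandTable = std::map<std::string, std::function<void()>>;

// Reads recognition results line by line and runs the action for every
// known word found in a line. Returns how many actions were run.
// ASSUMPTION: one result per line, words separated by whitespace.
int dispatch_commands(std::istream& results, const CommandTable& commands)
{
    int handled = 0;
    std::string line;
    while (std::getline(results, line)) {
        std::istringstream words(line);
        std::string word;
        while (words >> word) {
            auto it = commands.find(word);
            if (it != commands.end() && it->second) {
                it->second();
                ++handled;
            }
        }
    }
    return handled;
}

// Opens the fifo (this blocks until the recognizer opens it for writing)
// and dispatches commands until the writer closes it.
int run_on_fifo(const std::string& fifo_path, const CommandTable& commands)
{
    std::ifstream fifo(fifo_path);
    if (!fifo) {
        std::cerr << "Unable to open " << fifo_path << "\n";
        return -1;
    }
    return dispatch_commands(fifo, commands);
}
```

You would call run_on_fifo("speech_fifo", table) from the srec/config/en.us directory in place of the cat command, with table holding entries such as "up" or "left" mapped to whatever you want to trigger.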
The srec recognizer does not require any training but has certain limitations. The most significant limitation is the vocabulary it can recognize. The larger the vocabulary you specify, the less accurate the recognition results will likely be. As a result this recognizer is best used for a small set of frequently used voice commands. Under srec/config/en.us/grammars/ there are a number of .grxml files which define what words the recognizer can understand. You can define your own simple grammar (.grxml) here which, for example, only recognizes the digits on a phone keypad. To do this you can follow the syntax of any of the other .grxml files in the directory and then execute run_compile_grammars.sh which will produce a .g2g file from the .grxml file. There is also a voicetag/texttag file with extension .tcp which needs to point to the g2g file of your choice. You can find the .tcp files under the srec/config/en.us/tcp directory. run_SRecTestAudio.sh points to a tcp file which you can specify.
Thursday, February 24, 2011
Kinect audio reverse engineering
I did some work on getting the kinect audio hardware to work as part of openkinect/libfreenect a while back. Here are some quick notes on what I've figured out and how:
Once the audio firmware has been loaded, the kinect sends 524 bytes to the xbox every 1 ms. Every tenth packet is short (60 bytes) and is sometimes preceded by an empty packet. The short packets appear to be non-audio data (maybe signaling of some sort), because if you exclude them the resulting data doesn't appear to have any gaps.
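The filtering step described above can be sketched in a few lines of C++. This is only an illustration: Packet and collect_audio_bytes are names invented for the example, and it keeps each 524 byte packet whole because I have not established whether some of those bytes are a header rather than sample data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One captured isochronous IN packet (hypothetical representation).
using Packet = std::vector<std::uint8_t>;

// Size of the packets suspected to carry audio.
constexpr std::size_t kAudioPacketSize = 524;

// Concatenates only the full-size packets, dropping the short (60 byte)
// and empty ones. ASSUMPTION: the whole 524 byte packet is kept as-is.
std::vector<std::uint8_t> collect_audio_bytes(const std::vector<Packet>& packets)
{
    std::vector<std::uint8_t> audio;
    for (const Packet& p : packets) {
        if (p.size() == kAudioPacketSize) {
            audio.insert(audio.end(), p.begin(), p.end());
        }
    }
    return audio;
}
```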
The audio samples appear to be 32-bit signed values at 16 kHz (if you assume that sample rate then the FFT of the recorded data has the correct frequency).
The 4 channels seem to be transmitted in order left to right from the perspective of someone looking at the front of the kinect. The leftmost channel is transmitted first. 256 samples of each channel are transmitted before switching to the next channel.
If you stitch together the separate 256 sample blocks to reconstruct a given channel, the data appears to be continuous. The plot below shows the captured 4-channel audio stream with the channels labeled from left to right as 1,2,3,4. You can see that the leftmost channel has the greatest amplitude, corresponding to the fact that the speaker was placed closest to the leftmost microphone. I repeated the test with a speaker near the rightmost microphone and, as expected, channel 4 became the strongest.
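To make the block layout concrete, here is a small C++ sketch of the stitching step. It assumes the byte stream has already been decoded into 32-bit signed samples (the byte order is not covered in these notes), and split_channels is just a name chosen for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kChannels = 4;       // microphones, left to right
constexpr std::size_t kBlockSamples = 256; // samples sent per channel before switching

// Splits a stream laid out as
//   [256 x ch1][256 x ch2][256 x ch3][256 x ch4][256 x ch1]...
// into one continuous vector per channel. A trailing partial block is dropped.
std::array<std::vector<std::int32_t>, kChannels>
split_channels(const std::vector<std::int32_t>& stream)
{
    std::array<std::vector<std::int32_t>, kChannels> channels;
    const std::size_t full_blocks = stream.size() / kBlockSamples;
    for (std::size_t b = 0; b < full_blocks; ++b) {
        std::vector<std::int32_t>& ch = channels[b % kChannels];
        const auto first = stream.begin() + static_cast<std::ptrdiff_t>(b * kBlockSamples);
        ch.insert(ch.end(), first, first + static_cast<std::ptrdiff_t>(kBlockSamples));
    }
    return channels;
}
```

Index 0 of the returned array is the leftmost microphone as seen from the front of the kinect, matching channel 1 in the description above.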

I was able to determine the information in the last paragraph by synthesizing a 700Hz sine wave in matlab and then playing it back at the kinect with a speaker nearest the leftmost microphone (as seen from the front of the kinect). I then captured the data stream coming back from the kinect while I played the sine wave using a beagle USB sniffer. I extracted the 524 byte blocks I suspected to be audio from the beagle dumps and then post processed them with a series of shell scripts before reading them into matlab and plotting the FFT of this audio as seen below:

The frequency shown by the FFT is approximately 700 Hz, as expected. This suggests that my interpretation of the audio format is correct.
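I did this check in matlab, but the same idea can be sketched in C++ without any libraries. The code below is a stand-in, not what I actually ran: it uses a naive DFT (fine for a few thousand samples) to find the strongest frequency bin, and make_sine and dominant_frequency_hz are names invented for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Synthesizes `count` samples of a unit-amplitude sine wave.
std::vector<double> make_sine(double freq_hz, double sample_rate_hz, std::size_t count)
{
    const double pi = std::acos(-1.0);
    std::vector<double> samples(count);
    for (std::size_t t = 0; t < count; ++t) {
        samples[t] = std::sin(2.0 * pi * freq_hz * static_cast<double>(t) / sample_rate_hz);
    }
    return samples;
}

// Returns the frequency (in Hz) of the strongest non-DC bin of a naive
// O(N^2) DFT. The answer is only as precise as the bin spacing,
// sample_rate_hz / N.
double dominant_frequency_hz(const std::vector<double>& samples, double sample_rate_hz)
{
    const std::size_t n = samples.size();
    assert(n >= 2);
    const double pi = std::acos(-1.0);
    std::size_t best_bin = 1;
    double best_power = -1.0;
    for (std::size_t k = 1; k <= n / 2; ++k) {
        double re = 0.0;
        double im = 0.0;
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = 2.0 * pi * static_cast<double>(k) *
                                 static_cast<double>(t) / static_cast<double>(n);
            re += samples[t] * std::cos(angle);
            im -= samples[t] * std::sin(angle);
        }
        const double power = re * re + im * im;
        if (power > best_power) {
            best_power = power;
            best_bin = k;
        }
    }
    return static_cast<double>(best_bin) * sample_rate_hz / static_cast<double>(n);
}
```

If one channel decoded under the assumed format (32-bit signed samples at 16 kHz) peaks near 700 Hz, the assumed sample rate is consistent with the test tone. A wrong sample rate would scale the peak proportionally.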
Firmware loading process
I've managed to duplicate what I think is most of the init sequence so far. I send all the same control transfers and bulk transfers as the Xbox, as far as I can tell. My beagle480 confirms that I mirror the Xbox behavior for the most part. After completing a series of 512 byte bulk-out transfers, which I assume is some sort of bootstrapping firmware upload, the audio device re-enumerates. I wait for that to happen and open the new audio device, then send some more control and bulk transfers. So far, so good; this all follows what I see in the Xbox logs. At this point the Xbox appears to send 12 cycles of (1 xfer: 0 byte iso IN, 8 xfers 4 bytes out), which I also duplicate perfectly. Now, the final step is a very long stream of (1 xfer: 0 byte iso IN, 8 xfers 76 bytes out) before eventually those 0 byte IN transfers become 524 byte transfers. Unfortunately it seems the content of those 76 byte OUT transfers must matter, because after trying all zeros I never get any data back in my IN transfers (even after >5000 IN transfers). I have some scripts I'll use to try to generate code for all those OUT transfers directly from the .tdc files.
Friday, December 17, 2010
HOWTO: use the kinect as a mouse in linux
In an earlier post I explained how to get PrimeSense's NITE up and running and how to use the samples they provided. Now some people might be thinking "cool, but how can I use this?" I thought using NITE hand tracking to control the cursor would be a good and simple demonstration.
The linux kernel provides a means to create userspace input drivers using a feature called uinput. If you compile your kernel with uinput enabled as a module you can then simply run:
modprobe uinput
to load the uinput module. Once the module is loaded you can use the piece of code I've embedded below to convert the coordinates output by the NITE code into actual mouse/cursor movement. In short:
(1) Download the code below
(2) Save it as ~/kinect/NITE/Nite-1.3.0.17/Samples/SingleControl/main.cpp (you might want to back up the original)
(3)
cd ~/kinect/NITE/Nite-1.3.0.17 && make
(4) Note: do the following as root or using sudo
~/kinect/NITE/Nite-1.3.0.17/Samples/Bin/Sample-SingleControl
(5) Perform a focus gesture to start the hand tracking (check out my video above to see how to do that)
At this point you should be able to do what I do in the video above. You can also extend the code to generate mouse clicks, keystrokes, etc. Have fun.
The formatting of the code below got a bit mangled, so here's a direct download link:
http://dl.dropbox.com/u/11217419/main.cpp
If I wind up spending more time on the code then I might bother to put it up on github or something, but for now, sorry that it's dropbox only.
/****************************************************************************
* *
* Nite 1.3 - Single Control Sample *
* *
* Author: Oz Magal *
* *
****************************************************************************/
/****************************************************************************
* *
* Nite 1.3 *
* Copyright (C) 2006 PrimeSense Ltd. All Rights Reserved. *
* *
* This file has been provided pursuant to a License Agreement containing *
* restrictions on its use. This data contains valuable trade secrets *
* and proprietary information of PrimeSense Ltd. and is protected by law. *
* *
****************************************************************************/
//-----------------------------------------------------------------------------
// Headers
//-----------------------------------------------------------------------------
// General headers
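#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <linux/input.h>
#include <linux/uinput.h>
// OpenNI headers
#include <XnOpenNI.h>
// NITE headers
#include <XnVSessionManager.h>
#include "XnVMultiProcessFlowClient.h"
#include <XnVWaveDetector.h>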
#include "kbhit.h"
#include "signal_catch.h"
// xml to initialize OpenNI
#define SAMPLE_XML_FILE "../../Data/Sample-Tracking.xml"
static int uinp_fd = -1;
struct uinput_user_dev uinp; // uInput device structure
struct input_event event; // Input device structure
XnBool g_bQuit = false;
int setup_uinput_device()
{
uinp_fd = open("/dev/uinput", O_WRONLY | O_NDELAY);
if (uinp_fd < 0) // open() returns -1 on failure, not 0
{
printf("Unable to open /dev/uinput\n");
return -1;
}
memset(&uinp,0,sizeof(uinp)); // Initialize the uInput device structure to zero
strncpy(uinp.name, "Kinect Mouse", UINPUT_MAX_NAME_SIZE);
uinp.id.version = 4;
uinp.id.bustype = BUS_USB;
// Advertise relative X/Y motion and the usual mouse buttons
ioctl(uinp_fd, UI_SET_EVBIT, EV_KEY);
ioctl(uinp_fd, UI_SET_EVBIT, EV_REL);
ioctl(uinp_fd, UI_SET_RELBIT, REL_X);
ioctl(uinp_fd, UI_SET_RELBIT, REL_Y);
ioctl(uinp_fd, UI_SET_KEYBIT, BTN_MOUSE);
ioctl(uinp_fd, UI_SET_KEYBIT, BTN_TOUCH);
ioctl(uinp_fd, UI_SET_KEYBIT, BTN_LEFT);
ioctl(uinp_fd, UI_SET_KEYBIT, BTN_MIDDLE);
ioctl(uinp_fd, UI_SET_KEYBIT, BTN_RIGHT);
ioctl(uinp_fd, UI_SET_KEYBIT, BTN_FORWARD);
ioctl(uinp_fd, UI_SET_KEYBIT, BTN_BACK);
write(uinp_fd, &uinp, sizeof(uinp));
if (ioctl(uinp_fd, UI_DEV_CREATE))
{
printf("Unable to create UINPUT device.");
return -1;
}
return 1;
}
void move_cursor(int x, int y )
{
memset(&event, 0, sizeof(event));
gettimeofday(&event.time, NULL);
event.type = EV_REL;
event.code = REL_X;
event.value = x;
write(uinp_fd, &event, sizeof(event));
event.type = EV_REL;
event.code = REL_Y;
event.value = y;
write(uinp_fd, &event, sizeof(event));
event.type = EV_SYN;
event.code = SYN_REPORT;
event.value = 0;
write(uinp_fd, &event, sizeof(event));
}
//-----------------------------------------------------------------------------
// Callbacks
//-----------------------------------------------------------------------------
// Callback for when the focus is in progress
void XN_CALLBACK_TYPE SessionProgress(const XnChar* strFocus, const XnPoint3D& ptFocusPoint, XnFloat fProgress, void* UserCxt)
{
printf("Session progress (%6.2f,%6.2f,%6.2f) - %6.2f [%s]\n", ptFocusPoint.X, ptFocusPoint.Y, ptFocusPoint.Z, fProgress, strFocus);
}
// callback for session start
void XN_CALLBACK_TYPE SessionStart(const XnPoint3D& ptFocusPoint, void* UserCxt)
{
printf("Session started. Please wave (%6.2f,%6.2f,%6.2f)...\n", ptFocusPoint.X, ptFocusPoint.Y, ptFocusPoint.Z);
}
// Callback for session end
void XN_CALLBACK_TYPE SessionEnd(void* UserCxt)
{
printf("Session ended. Please perform focus gesture to start session\n");
}
// Callback for wave detection
void XN_CALLBACK_TYPE OnWaveCB(void* cxt)
{
printf("Wave!\n");
}
// callback for a new position of any hand
void XN_CALLBACK_TYPE OnPointUpdate(const XnVHandPointContext* pContext, void* cxt)
{
//printf("%d: (%f,%f,%f) [%f]\n", pContext->nID, pContext->ptPosition.X, pContext->ptPosition.Y, pContext->ptPosition.Z, pContext->fTime);
//printf("%f,%f,%f\n", pContext->ptPosition.X, pContext->ptPosition.Y, pContext->ptPosition.Z);
move_cursor((int)(pContext->ptPosition.X/4),(int) -(pContext->ptPosition.Y/4));
}
//-----------------------------------------------------------------------------
// Main
//-----------------------------------------------------------------------------
// this sample can run either as a regular sample, or as a client for multi-process (remote mode)
int main(int argc, char** argv)
{
xn::Context context;
XnVSessionGenerator* pSessionGenerator;
XnBool bRemoting = FALSE;
if (argc > 1)
{
// remote mode
context.Init();
printf("Running in 'Remoting' mode (Section name: %s)\n", argv[1]);
bRemoting = TRUE;
// Create multi-process client
pSessionGenerator = new XnVMultiProcessFlowClient(argv[1]);
XnStatus rc = ((XnVMultiProcessFlowClient*)pSessionGenerator)->Initialize();
if (rc != XN_STATUS_OK)
{
printf("Initialize failed: %s\n", xnGetStatusString(rc));
delete pSessionGenerator;
return 1;
}
}
else
{
// Local mode
// Create context
XnStatus rc = context.InitFromXmlFile(SAMPLE_XML_FILE);
if (rc != XN_STATUS_OK)
{
printf("Couldn't initialize: %s\n", xnGetStatusString(rc));
return 1;
}
// Create the Session Manager
pSessionGenerator = new XnVSessionManager();
rc = ((XnVSessionManager*)pSessionGenerator)->Initialize(&context, "Click", "RaiseHand");
if (rc != XN_STATUS_OK)
{
printf("Session Manager couldn't initialize: %s\n", xnGetStatusString(rc));
delete pSessionGenerator;
return 1;
}
// Initialization done. Start generating
context.StartGeneratingAll();
}
// Register session callbacks
pSessionGenerator->RegisterSession(NULL, &SessionStart, &SessionEnd, &SessionProgress);
// Start catching signals for quit indications
CatchSignals(&g_bQuit);
// init & register wave control
XnVWaveDetector wc;
wc.RegisterWave(NULL, OnWaveCB);
wc.RegisterPointUpdate(NULL, OnPointUpdate);
pSessionGenerator->AddListener(&wc);
printf("Please perform focus gesture to start session\n");
printf("Hit any key to exit\n");
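// Without a uinput device there is nothing to move, so bail out early
if (setup_uinput_device() < 0)
{
printf("Could not set up the uinput device (is the uinput module loaded, and are you root?)\n");
delete pSessionGenerator;
context.Shutdown();
return 1;
}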
// Main loop
while ((!_kbhit()) && (!g_bQuit))
{
if (bRemoting)
{
((XnVMultiProcessFlowClient*)pSessionGenerator)->ReadState();
}
else
{
context.WaitAndUpdateAll();
((XnVSessionManager*)pSessionGenerator)->Update(&context);
}
}
delete pSessionGenerator;
context.Shutdown();
ioctl(uinp_fd, UI_DEV_DESTROY);
close(uinp_fd);
return 0;
}
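The listing above only moves the cursor. As a hedged sketch of the "extend the code to generate mouse clicks" idea, the helpers below build a press/release pair for a button and write it to a uinput file descriptor. make_event, make_click_events and send_events are names invented for this example; the listing already registers BTN_LEFT via UI_SET_KEYBIT, so the device it creates should accept these events. I leave the timestamps zeroed here and have not tested whether any setup depends on them.

```cpp
#include <cassert>
#include <cstring>
#include <vector>
#include <unistd.h>
#include <linux/input.h>

// Builds a single input_event with a zeroed timestamp.
static input_event make_event(unsigned short type, unsigned short code, int value)
{
    input_event ev;
    std::memset(&ev, 0, sizeof(ev));
    ev.type = type;
    ev.code = code;
    ev.value = value;
    return ev;
}

// Returns the four events that make up one click:
// button down, sync, button up, sync.
std::vector<input_event> make_click_events(unsigned short button)
{
    return {
        make_event(EV_KEY, button, 1),
        make_event(EV_SYN, SYN_REPORT, 0),
        make_event(EV_KEY, button, 0),
        make_event(EV_SYN, SYN_REPORT, 0),
    };
}

// Writes the events to an already created uinput device.
// Returns false as soon as one write fails or is short.
bool send_events(int fd, const std::vector<input_event>& events)
{
    for (const input_event& ev : events) {
        if (write(fd, &ev, sizeof(ev)) != static_cast<ssize_t>(sizeof(ev))) {
            return false;
        }
    }
    return true;
}
```

One way to hook this up would be to call send_events(uinp_fd, make_click_events(BTN_LEFT)) from a gesture callback such as OnWaveCB, though which gesture makes a comfortable click is something you would have to experiment with.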
Friday, December 10, 2010
HOWTO: Kinect + OpenNI/NITE skeleton tracking and gesture recognition in gentoo
Thanks to the folks at PrimeSense, libraries are now available for skeleton tracking and gesture recognition.
UPDATE: Check here if you've gotten NITE working and want to try using the kinect as a Minority Report style mouse.
UPDATE: I've added a description of how to track multiple hands under the Sample-PointViewer description.
Here's how I got things working in gentoo:
(1)
mkdir ~/kinect && cd ~/kinect
(2)
git clone https://github.com/OpenNI/OpenNI.git
(3)
cd OpenNI/Platform/Linux-x86/Build
(4)
make && sudo make install
(5)
cd ~/kinect/
(6)
git clone https://github.com/boilerbots/Sensor.git
(7)
cd Sensor
(8)
git checkout kinect
(9)
cd Platform/Linux-x86/Build
(10)
make && sudo make install
(11) Go to this page at openNI to download the latest NITE release for your platform: NITE download page or for the impatient:
32-bit
64-bit
UPDATE: download links now point to openNI and should work again
(12) Save the NITE tarball to ~/kinect and untar it
(13)
cd ~/kinect/NITE/Nite-1.3.0.17/Data
(14) Open Sample-User.xml and replace the existing License line with the line below:
NOTE: this is case sensitive!
<License vendor="PrimeSense" key="0KOIk2JeIBYClPWVnMoRKn5cdY4="/>
(15) Repeat step 14 for Sample-Scene.xml and Sample-Tracking.xml
(16) Open Sample-User.xml and replace the existing MapOutputMode line with the line below.
NOTE: this is case sensitive!
<MapOutputMode xRes="640" yRes="480" FPS="30"/>
(17) Repeat step 16 for Sample-Scene.xml and Sample-Tracking.xml
(18)
niLicense PrimeSense 0KOIk2JeIBYClPWVnMoRKn5cdY4=
(19)
cd ~/kinect/NITE/Nite-1.3.0.17/
(20)
sudo ./install.bash
(21)
make && sudo make install
(22)
cd ~/kinect/NITE/Nite-1.3.0.17/Samples/Bin
Now finally you should be sitting in a directory with all the sample binaries that you can play with. Here's what they should look like:
Sample-TrackPad:
This app will track your hand and show its relative position on a grid. Run it and wave your hand; one of the squares on the grid should turn yellow to indicate your hand's location as seen below:

you should also get some debug output in your console:

Sample-Players
This app demonstrates the skeletal tracking. After starting it up, move around or wave until your body changes to blue (subsequent players will be other colors, e.g. player 2 is green, 3 yellow, etc.). At this point your viewer window should look vaguely like this:

and you should see something like this in your console:
Look for pose
Found pose "Psi" for user 1
Now, hold your arms out to your sides bent 90 degrees at the elbows as shown below until a skeleton is overlaid on the image of your body:

At this point something like this should have appeared in your console:
Calibration started
Calibration done [1] successfully
Writing 217.596 50 50 78.4388 64.6762
Matching for existing calibration
Read 217.596 50 50 78.4388 64.6762
Sample-SingleControl
This seems to do some sort of gesture recognition and dynamically adjusts the camera resolution, so it's probably zooming in on an area of interest. When it starts out it asks you to perform a focus gesture. The NITE documentation doesn't seem to define what this would be but simply sticking one hand out in front of you seems to make it happy and you'll see the following output:

SamplePointViewer
This app does hand tracking. UPDATE: to allow multiple hands to be tracked you will need to edit /usr/etc/primesense/XnVHandGenerator/Nite.ini by uncommenting the two config parameters it contains. Basically, remove the semicolons at the start of each line so that Nite.ini looks like this:
[HandTrackerManager]
AllowMultipleHands=1
TrackAdditionalHands=1
To persistently track different hands in your code you can make use of the XnVHandPointContext.nID in your OnPointUpdate callback.
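The Nite.ini edit described above can also be scripted. The snippet below is only a minimal sketch of that change: it assumes the two parameters are commented out with a leading semicolon, as described, and the function name uncomment_ini is made up for this example and is not part of NITE.

```python
import re

def uncomment_ini(text):
    """Strip a leading ';' from lines that look like ';Key=Value'."""
    out = []
    for line in text.splitlines():
        # Only key=value lines are touched; section headers and
        # free-form comments are passed through unchanged.
        out.append(re.sub(r'^\s*;\s*(?=\w+\s*=)', '', line))
    return "\n".join(out) + "\n"

# Stand-in for the contents of Nite.ini before the edit (assumed layout).
sample = "[HandTrackerManager]\n;AllowMultipleHands=1\n;TrackAdditionalHands=1\n"
print(uncomment_ini(sample))
```

To apply it for real you would read /usr/etc/primesense/XnVHandGenerator/Nite.ini, pass its contents through the function and write the result back as root, ideally after making a backup copy.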

Sample-Boxes
This example lets you click one of three boxes. Your hand motion is tracked by a slider and, depending on the context, up, left, and right gestures will be recognized.

Sample-CircleControl
Wave to make the border of the window turn green. Then I think you need to send a focus gesture, after which tracing a circle in the air with your hand makes the onscreen circle follow your hand, as seen below. In other words, if you draw a clockwise circle in the air, the clock hand will also spin clockwise, and vice versa. For some reason, this appears to be annoyingly inconsistent.

Sample-SceneAnalysis
This seems to just do player detection without skeleton tracking:
Monday, October 18, 2010
The magic keyboard

As I hinted at in an earlier post, the magic trackpad hardware is very well designed and bears a strong resemblance to the Fingerworks series of input devices. Now, I've decided to extend the functionality of this device by building a wireless multitouch keyboard using two magic trackpads. Currently, this keyboard will only work in linux since it relies on my extensive modifications to the linux kernel driver for the magic trackpad.
The first image above shows my two magic trackpads with plastic overlays to indicate key placement. It turns out the magic trackpad will still detect contact through a thin insulator placed on top of its surface. The overlays are simply standard laser-printable overhead projector transparencies. I originally considered laser etching the surface of the trackpads at TechShop, but I think I will hold off until I've settled on a key arrangement that I like. Here's a picture showing how I originally planned to lay out the keys relative to my hand.

For now, I opted against the layout above in order to simplify the code that converts coordinates to keycodes. The layout I'm currently using consists of three concentric circles: the code checks which circles contain the current touch to determine the corresponding "row" and then compares the X coordinate to a table of thresholds to determine the "column". I created the layout seen in the other pictures using Illustrator. I'll post links to PDFs for the left and right halves in case anyone feels like trying this out.
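To make the row and column lookup concrete, here is a rough Python sketch of the idea. It is not the driver code: the center, radii, thresholds, key names and even which circle maps to which row are all made-up placeholders chosen only to show the technique.

```python
import bisect

# All numbers and key names below are hypothetical placeholders, not
# the values used by the actual driver or overlay.
CENTER = (0, -3000)                       # common center of the circles
RADII = (2000, 3500, 5000)                # inner to outer radius
COLUMN_EDGES = (-1500, -500, 500, 1500)   # X thresholds between columns
KEYS = (
    ("z", "x", "c", "v", "b"),   # row 0: inside the smallest circle
    ("a", "s", "d", "f", "g"),   # row 1: inside the middle circle only
    ("q", "w", "e", "r", "t"),   # row 2: inside the largest circle only
)

def row_for(x, y):
    """Return the index of the smallest circle containing (x, y), or None."""
    d2 = (x - CENTER[0]) ** 2 + (y - CENTER[1]) ** 2
    for row, radius in enumerate(RADII):
        if d2 <= radius ** 2:
            return row
    return None

def column_for(x):
    """Return the number of thresholds at or below x, used as the column."""
    return bisect.bisect_right(COLUMN_EDGES, x)

def key_for(x, y):
    """Map a touch coordinate to a key name, or None if outside all circles."""
    row = row_for(x, y)
    if row is None:
        return None
    return KEYS[row][column_for(x)]
```

Comparing squared distances avoids a square root per touch, and the sorted threshold table keeps the column lookup to a single bisect call.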
The keyboard was intended to be placed on the lap to minimize elbow flexion. To hold the halves together I used two strips of rubberized truck tarp (very strong stuff used for a windsurfing roof rack). I'm still working out how to make this attachment adjustable without being bulky.

The transparent blobs seen on the home row of the keyboard in the previous images are clear nail polish. A few coats created a raised ridge that makes it easier to find the home row while touch typing. This is actually the principal drawback of a multitouch keyboard: the inherent lack of tactile feedback. The nail polish helps somewhat, but I'm still thinking of other ways to mitigate this issue. The image below shows the nail polish blobs close up.

I currently have preliminary mixed typing/mousing support and a few simple gestures. I also implemented certain behaviors which I think make the use of a touchpad far more ergonomic:
(1)Moving the cursor and single clicking is done with two fingers rather than one.
This helps eliminate the annoyance of inadvertent cursor movement or clicks when a single finger alights on the trackpad. I also think it's more ergonomic, since for me using a single finger requires more effort than using two.
(2)Double clicking is done with a three finger tap- this eliminates the hassle of properly timing a double click.
(3)Scrolling is done with four fingers instead of two. This allows just plopping your entire hand down on the trackpad to scroll rather than contorting it to extend only two fingers.
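The three behaviors above boil down to a lookup on the number of fingers touching the pad. A minimal Python sketch of that dispatch follows; the action names and the split into "move" and "tap" are placeholders for illustration, not identifiers from the kernel driver.

```python
# Finger counts follow the three behaviors listed above; the action
# names are hypothetical labels, not taken from the driver source.
ACTIONS = {
    2: {"move": "move_cursor", "tap": "single_click"},
    3: {"tap": "double_click"},
    4: {"move": "scroll"},
}

def classify(num_fingers, kind):
    """Map a finger count and a motion kind ('move' or 'tap') to an action."""
    return ACTIONS.get(num_fingers, {}).get(kind, "ignore")
```

Anything not in the table, including a single finger resting on the pad, falls through to "ignore" here, so it cannot move the cursor or click by accident.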
Todo list:
Software:
Add support for modifiers (i.e. shift, ctrl, alt, meta) using chords.
Write a GUI to rearrange key placement which generates suitable header files based on user's choices. Perhaps eventually come up with a means to allow reconfiguration on the fly.
Hardware:
Come up with better attachment and adjustment system for the straps which connect the two halves of the keyboard
To give some context for this project, here are some images of the devices the magic trackpad derives from: the iGesture and touchstream. You can see that the magic trackpad is significantly smaller than both the iGesture and touchstream, which is why I think that modifiers will have to be handled only via chords (no room for separate keys). In the images below there's a weird blob on the USB cords of the iGesture and touchstream; those are my improvised hot-glue strain reliefs.


Monday, August 30, 2010
Magic Trackpad using ten fingers and with gentoo linux support
I picked up a magic trackpad this past Friday and decided to see what it was capable of under linux. The video above shows that the magic trackpad hardware provides a great deal of information to the host operating system. As I'd hoped, it seems to be nearly as capable as a fingerworks igesture or touchpad in some regards. In the video you can see that the trackpad is able to detect 10 fingers, track the elliptical size of each finger contact (i.e. along two axes) and its orientation, and do all of the above smoothly at a high sample rate.
The video was created by streaming the debugfs file entry corresponding to the trackpad into a pygame application. The pygame code parses the apple protocol packets to determine ellipse sizes, positions and orientations and then blits them to the screen on transparent surfaces. This code is by no means pretty, in fact "quick hack" is probably a better description. Any suggestions, enhancements, criticisms, etc. are welcome.
If you want to try the code below yourself, you first need to make sure you have debugfs set up properly. Assuming you enabled debugfs when building your kernel, make sure debugfs is mounted and if it's not, mount it with something like:
mount -t debugfs none /sys/kernel/debug/
Now, if you look in /sys/kernel/debug/hid/ you should see a directory whose name corresponds to the address of your magic trackpad. If you switch into this directory you'll see an events file which you can then read with cat, tail, or the python script below.
#!/usr/bin/python
import ctypes
import re

import pygame

w=1280
h=1024
#this is the debugfs entry for the trackpad, you can figure
#this out by watching the system log when you connect the trackpad
fd = open('/sys/kernel/debug/hid/0005:05AC:030E.0008/events','r')
screen = pygame.display.set_mode((w,h))
pygame.font.init()
#match_font returns None if 'mikachan' is not installed, in which
#case pygame falls back to its default font
font1 = pygame.font.match_font('mikachan')
mikaFont = pygame.font.Font(font1,28)
for index in range(20000):
    oline=fd.readline()
    line=oline.split()
    #repaint black background
    #if you move this outside the loop, you get
    #fingerpainting-like behaviour, i.e. fingers leave tracks
    screen.fill((0,0,0))
    packets=[]
    #the packet header starts at token 5, so shorter lines are skipped
    if((len(line)<6) or (line[0]!= "report")):
        continue
    #strip out everything but digits
    reportsize=re.sub(r'\D',"",line[2])
    header=line[5:9]
    #handle the double packet message ID
    if(int(header[0],16)==0xf7):
        firstpktlength=int(header[1],16)
        packets.append(line[7:7+firstpktlength])
        packets.append(line[firstpktlength+7:])
    elif(int(header[0],16)==0x28):
        packets.append(line[5:])
    if not packets:
        continue
    for pkt in packets:
        #skip the 4 byte packet header, each touch is 9 bytes
        data=pkt[4:]
        #integer division so this also works under python 3
        numtouches=len(data)//9
        text1 = mikaFont.render(str(numtouches),1,(255,0,0),(0,0,0))
        screen.blit(text1,(200,200))
        for itouch in range(numtouches):
            color=(0,0,100)
            if(itouch%5==0):
                color=(0,0,255)
            if(itouch%5==1):
                color=(0,255,0)
            if(itouch%5==2):
                color=(255,0,0)
            tdata=data[itouch*9:(itouch+1)*9]
            if(len(tdata)<9):
                continue
            X=(((int(tdata[1],16)&0x1F)<<27) | (int(tdata[0],16)<<19))
            Y=((((int(tdata[3],16)&0x3)<<30) | (int(tdata[2],16)<<22)|(int(tdata[1],16)<<14))>>19)
            X=X>>19
            X=ctypes.c_int32(X).value
            X=X&0x1FFF
            Y=Y&0x1FFF
            #handle 2's complement
            if(X&0x1000):
                X=X-8192
            if(Y&0x1000):
                Y=Y-8192
            #invert Y
            Y=-Y
            #push coordinates up into positive range
            #then rescale to fit inside our pygame window
            X=(X+3499)//6
            Y=(Y+2856)//6
            radius=int(tdata[6],16)&0x3f
            orientation=(int(tdata[7],16)>>2)-32
            touch_major=int(tdata[4],16)
            touch_minor=int(tdata[5],16)
            if((radius==0) or (touch_minor==0) or (touch_major==0)):
                continue
            if (radius>60):
                radius=60
            #if((int(tdata[8],16)&0xf0)==0x40): #only display fingers in drag state
            if(1):
                surface=pygame.Surface((touch_minor*2,touch_major*2))
                surface.set_colorkey((0,0,0))
                pygame.draw.ellipse(surface, color, (0,0,2*touch_minor,2*touch_major))
                surface=pygame.transform.rotate(surface,orientation)
                screen.blit(surface,(X,Y))
    pygame.display.flip()
fd.close()
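The bit twiddling in the script is easier to follow, and to test without a trackpad attached, when it is pulled out into a function. The sketch below mirrors the same field layout the script uses (13 bit signed X and Y packed across the first four bytes); the function and key names are made up for this example, and the layout is only what the script itself assumes, not an official description of the protocol.

```python
def sign_extend_13(value):
    """Interpret the low 13 bits of value as a two's complement number."""
    value &= 0x1FFF
    return value - 0x2000 if value & 0x1000 else value

def decode_touch(tdata):
    """Decode one 9 byte touch record given as a list of hex strings."""
    b = [int(tok, 16) for tok in tdata[:9]]
    # X: byte 0 is the low 8 bits, the low 5 bits of byte 1 are the top bits.
    x = sign_extend_13(((b[1] & 0x1F) << 8) | b[0])
    # Y: top 3 bits of byte 1, all of byte 2, then the low 2 bits of byte 3.
    # The sign is flipped afterwards, exactly as the script does.
    y = -sign_extend_13(((b[3] & 0x03) << 11) | (b[2] << 3) | (b[1] >> 5))
    return {
        "x": x,
        "y": y,
        "touch_major": b[4],
        "touch_minor": b[5],
        "size": b[6] & 0x3F,
        "orientation": (b[7] >> 2) - 32,
    }

# Hand-made record for illustration, not captured from real hardware.
print(decode_touch(["10", "00", "00", "00", "20", "10", "05", "80", "40"]))
```

The returned x and y are still in raw trackpad units; the script's (X+3499)//6 and (Y+2856)//6 rescaling can be applied afterwards to fit them into the pygame window.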
Note that support for the magic trackpad is still in a very preliminary stage, so getting things working is fairly involved. I'll describe how to play around with the hardware at a low level in the description below.
First, start with a kernel new enough to have magic-mouse support. In my case, I chose gentoo-2.6.35-r5. Now grab the multitouch branch of Chase Douglas's debian git repository. You'll need to copy or merge several files from Chase's kernel source to your own:
/drivers/hid/hid-magicmouse.c
/drivers/hid/hid-core.c
/drivers/hid/hid-ids.h
possibly a few others
If you feel like using the same kernel as Chase, just skip straight to compiling his source tree. Once the kernel is built and you've booted into it, it's time to play around with the trackpad.
Depending on the version of bluez you're using, the procedure for pairing to an HID device will vary. In bluez 3.32 you would first set the trackpad as a trusted device using DBUS, which you could either do programmatically or using a graphical tool like d-feet to call the method
/org/bluez/hci0/org.bluez.Adapter.SetTrusted("bluetooth address of your trackpad")
Once the device is set as trusted you would need to actually set up a pairing as root:
passkey-agent --default 0000
then use d-feet or DBUS CLI to call CreateDevice("BluetoothAddress"). At this point if you call ListDevices() you should see a new device corresponding to the trackpad.
Once you've paired with the trackpad and loaded the hid-magicmouse kernel module you should see messages in the system log indicating that a new input device has been registered.
input: Apple Wireless Trackpad as /class/input/input6
magicmouse 0005:05AC:030E.0005: input,hidraw4: BLUETOOTH HID v1.60 Mouse [Apple Wireless Trackpad] on 00:27:48:09:63:60
input: Apple Wireless Trackpad as /class/input/input7
Note that you'll also want an up to date installation of evdev, otherwise you might get messages like this:
evdev.c(EVIOCGBIT): Suspicious buffer size 511, limiting output to 64 bytes. See http://userweb.kernel.org/~dtor/eviocgbit-bug.html
If you want to see the touch reports the trackpad produces you can use the evtest application. In gentoo this is provided by the joystick ebuild. Now if you do
$ evtest /dev/input/event6
Input driver version is 1.0.0
Input device ID: bus 0x5 vendor 0x5ac product 0x30e version 0x160
Input device name: "Apple Wireless Trackpad"
Supported events:
  Event type 0 (Sync)
  Event type 1 (Key)
    Event code 272 (LeftBtn)
    Event code 325 (ToolFinger)
    Event code 330 (Touch)
    Event code 333 (Tool Doubletap)
    Event code 334 (Tool Tripletap)
    Event code 335 (?)
  Event type 3 (Absolute)
    Event code 0 (X)
      Value 3097
      Min -2909
      Max 3167
    Event code 1 (Y)
      Value 2238
      Min -2456
      Max 2565
    Event code 48 (?)
      Value 0
      Min 0
      Max 255
    Event code 49 (?)
      Value 0
      Min 0
      Max 255
    Event code 52 (?)
      Value 0
      Min -32
      Max 31
    Event code 53 (?)
      Value 0
      Min -2909
      Max 3167
    Event code 54 (?)
      Value 0
      Min -2456
      Max 2565
    Event code 57 (?)
      Value 0
      Min 0
      Max 15
  Event type 4 (Misc)
    Event code 3 (RawData)
Testing ... (interrupt to exit)
Now if you touch with two fingers you'll see a slew of output which should include something like:
Event: time 1283194689.870573, -------------- Config Sync ------------
Event: time 1283194689.870586, type 1 (Key), code 330 (Touch), value 0
Event: time 1283194689.870587, type 1 (Key), code 333 (Tool Doubletap), value 0
Event: time 1283194689.870589, -------------- Report Sync ------------
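If you want to do something with these events rather than just watch them scroll by, the lines are regular enough to parse. Here is a small Python sketch that assumes exactly the line format shown in the sample output above (other evtest versions may print things differently); the function name parse_event is made up for this example.

```python
import re

# Pattern written against the sample lines above; an assumption, not a
# format guaranteed by evtest.
EVENT_RE = re.compile(
    r"Event: time (?P<time>\d+\.\d+), "
    r"type (?P<type>\d+) \((?P<type_name>[^)]*)\), "
    r"code (?P<code>\d+) \((?P<code_name>[^)]*)\), "
    r"value (?P<value>-?\d+)"
)

def parse_event(line):
    """Return a dict for an evtest event line, or None for sync markers."""
    m = EVENT_RE.match(line.strip())
    if m is None:
        return None
    d = m.groupdict()
    return {
        "time": float(d["time"]),
        "type": int(d["type"]),
        "code": int(d["code"]),
        "name": d["code_name"],
        "value": int(d["value"]),
    }
```

Feeding it the output of evtest line by line gives one dict per key or axis event, while the "Config Sync" and "Report Sync" separator lines come back as None and can be used to group events into frames.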