December 21, 2004

Entering ^<a\s+href\s*.... with a pen

Over on Scoble's blog, Martin Alderson asks whether it's even possible to write strings like "^<a\s+href\s*=\s*"http:/\/([^"]*)"([^>]*)>(.*?(?=</a>))</ a>$" with a pen.

Well, it is possible, but I wouldn't recommend it. Typing is faster. In fact, that's an important point to remember about Tablet PCs (particularly convertibles): You can use them as regular notebooks. I type all the time with my Toshiba M200 Tablet PC. I use the pen too--when it's appropriate.

But back to Martin's question… What's it like to enter a string like the one above with the pen? I didn't know. I'd never tried anything similar. So I decided it was time for an experiment.

I tried typing the string Martin posted as well as entering it with the pen. There are several different ways to enter text with the pen, so I tried each of them. I'll explain each one below. For each test I simply entered the string into Notepad. Nothing fancy. Here's a screenshot of the results:

CodingTest.gif

The first time, I typed the string in 1:20. It took surprisingly long. Even though I can type code relatively fast, the awkward syntax slowed me down. I also realized I was typing in a room that was too dark. It was early morning and the sun hadn't come up, so I could barely make out the keys on the keyboard. I can type letters fine, but I use so many different keyboards that the symbols kept tripping me up. I needed to turn on the light, as I did later.

Next up I entered the text string using the pen and the onscreen keyboard:

OnscreenKeyboardSmall.gif

I simply tapped the letters with the pen. That took 1:35. Interestingly, much of my time was wasted toggling the shift key and then the various symbols. This created many double taps that I could have avoided if I had thought about it. Now, there are other keypad designs that people have developed to improve onscreen keyboard speed. One from Tom Clarkson even tries to combine ink recognition with classic keys.

I bet Martin would baulk at entering text on the onscreen keyboard—although if I have to edit some code with a pen I do use the onscreen keyboard this way. There are some alternatives closer to what I think Martin is expecting.

So next up I used the comb (or character) mode of the Tablet Input Panel. (The Tablet Input Panel supports the onscreen keyboard shown above, a comb mode, and a freestyle handwriting pad.) In the comb mode, each letter or symbol is written in its own cell:

UsingCombSmall.gif

The cells help to improve the segmentation of the characters and in turn the recognition. I was surprised how fast I was able to enter the string with the comb mode: 1:05. How can this be faster than typing? Oh, yeah, I was typing where it was too dark. (Tablet enthusiasts will nod their heads at this: with a Tablet you'll find yourself using your computer in more places, like in the dark where you're doing little or light typing--or lots of inking.)

To be fair, I turned on the light and tried typing again. Yep, it was faster this time: 1:05. And just to double-check that the comb timing wasn’t bogus, I tried it again. This time around it took 1:15. I made a lot of mistakes, so my time suffered. I also realized I was being slowed down by having to look up at the original string and back down at each cell that I was writing in. When you’re touch typing (except for me in the dark :-) ), you don't pay this penalty.

I tried one more time with the comb mode of the Input Panel. This time I tried not to look back and forth. And the change worked. The time was 1:00. But, oops, I made a mistake.

OK. I'm guessing that this comb mode isn’t what Martin was expecting either. And I'm not going to try Graffiti input. Instead I switched to the freestyle input mode and penned the text string. The total time was the fastest: 40 seconds. It blew away my typing. The problem? It didn't recognize much of anything:

FreestyleHandwritingCode.gif

The issue is that the recognizer is dictionary oriented. The Tablet PC recognizer matches the handwritten ink against words that it knows. It doesn't know the syntax of a computer language like that shown here. Notice, though, that it was able to pull out the "http:" because this is a known combination of characters. However, it also coalesced "^<a\s+" into "east." If you squint at the input string you can kind of guess why this happened. The '<' was recognized as an 'e' and the plus sign as the letter 't'. If you ignore the caret and backslash symbols, the resulting word is "east." Of course, it's wrong, but it's a reasonable guess.
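A toy Python sketch can make that "reasonable guess" concrete. It is not how the Tablet PC recognizer works internally; the confusion table and the tiny word list are hypothetical, and only the '<' to 'e' and '+' to 't' substitutions come from the result above.

```python
# Hypothetical confusion table: the letter each symbol's ink was taken for.
# Only '<' -> 'e' and '+' -> 't' come from the post; None means "dropped".
LOOKALIKES = {"<": "e", "+": "t", "^": None, "\\": None}

# Hypothetical stand-in for the recognizer's dictionary of known strings.
DICTIONARY = {"east", "http:"}


def coerce_to_letters(ink: str) -> str:
    """Replace look-alike symbols with letters and drop the ignored ones."""
    out = []
    for ch in ink:
        if ch in LOOKALIKES:
            replacement = LOOKALIKES[ch]
            if replacement is not None:
                out.append(replacement)
        else:
            out.append(ch)
    return "".join(out)


guess = coerce_to_letters("^<a\\s+")
print(guess, guess in DICTIONARY)  # prints: east True
```

Once the symbols are read as letters, the result happens to be a dictionary word, which is exactly the kind of answer a dictionary-oriented recognizer prefers.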

Clearly, this guess isn't good enough. The problem is that the built-in recognizer doesn't know anything about the coding syntax required. But what if it did? This is where Service Pack 2 (SP2) comes in. New to SP2 is a programmatic interface to control the input panel recognizer. You can tell it that you're expecting a ZIP code or a URL, for instance, and the quality of the recognition will go way up.

Unfortunately, the control you have over the recognizer context isn't powerful enough to express a full computer language. Right now you can pass a regular expression or one of several predefined context rules to the recognizer. This is fine for recognizing serial numbers, dates, times, names, etc. in edit fields. Martin's example is more problematic.
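How a context rule helps can be sketched in a few lines of Python. This is an illustration of the idea only, not the SP2 API: the candidate strings, their scores, and the best_match helper are all made up, and the five-digit pattern stands in for a ZIP code context.

```python
import re


def best_match(candidates, pattern):
    """Return the highest-scoring candidate that fully matches pattern, or None."""
    allowed = [(text, score) for text, score in candidates
               if re.fullmatch(pattern, text)]
    if not allowed:
        return None
    return max(allowed, key=lambda pair: pair[1])[0]


# Hypothetical recognizer output: (candidate text, confidence score).
candidates = [("9BI05", 0.41), ("98105", 0.38), ("qBlo5", 0.21)]

# Hypothetical context rule: exactly five digits.
ZIP_CODE = r"\d{5}"

# Without context, the top-scoring candidate wins even though it is not a ZIP code.
unconstrained = max(candidates, key=lambda pair: pair[1])[0]

# With context, candidates that cannot be a ZIP code are discarded first.
constrained = best_match(candidates, ZIP_CODE)

print(unconstrained, constrained)  # prints: 9BI05 98105
```

The constraint does not make the handwriting any clearer; it just removes answers that could never be right in that field.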

But just to show how powerful recognition context can be, I decided to code up an example that defines a regular expression that looks for the magic string. (No way is this acceptable in the general case, but just suspend disbelief for a minute to see how well the recognition can work.)

I took the AdvReco example from the Tablet PC SDK and added the following context string to match against:

"\\^<a\\\\s\\+href\\\\s\\*\\\\s\\*\"http:"


(In particular, I inserted:

L"\\^<a\\\\s\\+href\\\\s\\*\\\\s\\*\"http:",

just before the last item in the declaration of the gc_pwsInputScopes[] array.) That's the only change I made. In English, it says to look for the character sequence:

^<a\s+href\s*\s*"http:

These are the first several characters of Martin's sequence. Since this is a kludge, I figured there was no reason to look for the whole string.
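The pile of backslashes comes from two layers of escaping: once so the regular expression treats ^, \, + and * as literal characters, and again so the C++ compiler accepts the string literal. The Python sketch below reproduces both layers. It assumes the recognizer's pattern syntax escapes those characters with a backslash the way Python's re module does, and the c_string_escape helper is hypothetical.

```python
import re

# The literal character sequence the recognizer should look for.
TARGET = r'^<a\s+href\s*\s*"http:'

# Layer 1: regex escaping. Each metacharacter (^ \ + *) gets a backslash.
PATTERN = r'\^<a\\s\+href\\s\*\\s\*"http:'


def c_string_escape(text: str) -> str:
    """Layer 2 (hypothetical helper): escape backslashes, then quotes, for a C/C++ literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


# The pattern matches the whole target sequence, and not a shorter prefix.
assert re.fullmatch(PATTERN, TARGET)
assert not re.fullmatch(PATTERN, "^<a")

# After C escaping, this is the text that goes between the quotes in the source file.
print(c_string_escape(PATTERN))
```

Note that the sequence as entered has nothing between the two \s* runs, so it differs slightly from the start of Martin's string, which has an = there.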

After compiling and launching AdvReco, I went to the Inputscope menu and selected the input scope string I added. From then on, new input is matched against this string. Then I started handwriting the magic code as above. In the screenshot shown below, the top part of the window is my handwriting and the bottom shows various possible recognition results for the handwritten string. The top value is the "best" match--the exact string we are expecting.

HandwrttenCodeStringWithInputScopeSmall.gif


OK. Again, this is somewhat of a straw man test, but it illustrates that SP2's leveraging of context can significantly improve the recognition. This is a big payoff to Tablet users.

I'm also not saying that this is the way to code on a Tablet--even if handwriting were successfully recognized. But it doesn’t take much to imagine a context-aware recognizer that understands the syntax of the language, with the dictionary of variables and the like helping to constrain the recognition. Think IntelliSense meets ink.

What's all this mean? Is it practical to write code with a pen? Not unless you have to. I type faster, for instance. But, I do use the pen a lot when sketching out program flow or marking up a screenshot with changes I’d like to make. Maybe someday ink will be integrated into the IDE. Till then, I'll stick with using the pen in Journal and OneNote while thinking and the keyboard when coding in the IDE.

Posted by Loren at December 21, 2004 12:33 PM
Comments

Read Dvorak's opinion on the presumptive replacement of keyboards.
http://www.dvorak.org/blog/columns/7-12-04.html
(Clue: They aren't going away. Live with it!)

Posted by: Luci Sandor on December 21, 2004 03:59 PM

Exactly! I use the keyboard a lot on my Tablet PC. The pen is just another way to interact with the computer. It complements the keyboard, mouse, and touchpad.

In particular, I use the keyboard for coding, emails, writing 99% of this blog's posts, and more. The keyboard is crucial to the way I work.

I use a pen when thinking through designs, marking up screenshots, and drawing for fun.

It just depends what I'm doing.

Posted by: Loren on December 21, 2004 04:42 PM

Loren, this is a pretty interesting experiment but your advanced reco changes are not a valid experiment. The problem is you geared the input scope towards this particular regex. Since regex's character ranges are so wide, it would be useless to try to make an input scope for it. You'd need to turn off context altogether and disable the dictionary (which is of course possible.) Have you tried that?

What would be nice is an add-in for Visual Studio that looked like the onscreen keyboard except that it inserted common constructs such as if, try, catch, etc. It could also let you select a block of code and insert brace pairs around it.

Coding using the pen sucks balls but sometimes, especially while debugging, it's what we do.

Posted by: Josh Einstein on December 21, 2004 05:08 PM

Josh, the inputscope test isn't fair for the reasons you stated. It's a kludge. I wanted to try something simple to illustrate the value of SP2's inputscope even with a complicated mess of symbols.

I should come up with a better example that shows how the inputscope can be used in the real world.

Posted by: Loren on December 21, 2004 05:38 PM

But you essentially limited the inputscope to a single result, which means it's impossible to fail. Even if you wrote a bunch of chicken scratch, it would coerce it to the desired result. A more interesting example would be an inputscope for HTML tags in general.

Now of course that would be a gigantic input scope and would probably exceed a limitation. So now I would like to see support for regexps in the Tablet PC dictionary. That would be neat!

Posted by: Josh Einstein on December 21, 2004 07:57 PM

(Okay all my sample html tags just got eaten when I posted, but you get the picture...)

Posted by: Josh Einstein on December 21, 2004 07:57 PM

I can't wait for Perl 6 regexps which are infinitely more readable...

Posted by: Maurits on December 22, 2004 07:58 AM