Longest Common Subsequence Problem
Sometimes I really have to thank my readers, because they are just as educational to me as I may be to them. Here’s a case in point. Regarding my last post on stopping password theft by key loggers I got a really well thought out email by Steve Smith that I’m just going to have to cut and paste. I should have guessed someone had already encountered this problem before I did and there is some pretty crazy math involved. Here’s his email:
About the hole: it’s a classic computer science problem commonly referred to as the “longest common subsequence problem.” It’s not hard to make a program that works it out, and there’s plenty of code out there already that solves the problem. Interestingly, your examples’ LCS is “snjoopy,” which makes me think that if you were consistent with the extra typed characters that hole would be effectively plugged.
Consider if an attacker, instead of receiving 3 different strings with the password embedded in them, received 3 strings that were exactly the same. The LCS would simply be the entire string, and if the attacker wanted to try to brute-force it there’d be 2^n possible passwords (n being the length of the string containing the password). Your examples had lengths of around 25, so that’d be a pretty large search space. If the actual password were trivial (a single word or name, for example), the attacker could first brute force against a dictionary to reduce the search space, but anything non-trivial would be effectively protected, as far as I can see.
Granted, that level of password protection is pretty close to tin-foil-hat level of paranoid, but it makes me wonder if it would be possible for web browsers to simulate key presses in order to automate this kind of security. It could generate a hash of the URL (or somehow generate extra characters in a consistent way) and simulate key presses in between actual key presses during password entry to obfuscate the password. It would protect against key loggers and prevent the kind of analysis you described from figuring out the password from the obfuscated string. The only hitch I can think of would be how to handle when a user backspaces over part of the password and re-enters it.
It seems like a good idea, but I don’t really have any experience in this area to know if such functionality would even be possible.
Wikipedia has a good article on the LCS problem (http://en.wikipedia.org/wiki/Longest_common_subsequence_problem) if you’re interested. Also, if you want I can send you the Java code I made to solve the problem. I whipped it up just to quickly figure out the solution for your set of strings, but even without any attention paid to optimization it found the solution in a second or two.
Very informative, thank you Steve. And I think this proves an interesting point that a) the LCS for the strings are confused by consistency and b) it’s trivial to find the results in just a few seconds. What more can I say?



November 26th, 2006 at 6:06 pm
well actually.. the number of distinct brute forcing permutations is quite complicated to answer.. and not as simple as 2^n
It’s been a few semesters since i’ve taken discrete mathematics.. and a couple years since discrete structures - so i don’t want to have to recall it all ^^; .. but it’ll involve multiple factorials and will have to remove non-distinct ones after (caused by multiple copies of the same letter)
it does make the keyspace drastically smaller, but that’s not really the purpose of a key logger .-.
honestly though, i think it’s a bit of a moot point cuz i don’t think it’s a practice anyone will follow. clever solution nonetheless..
March 19th, 2007 at 2:10 am
please send me the java code you used to solve the longest common subsequence problem.
October 27th, 2007 at 1:01 pm
please send me the java code you used to solve the longest common subsequence problem.