Friday, December 15, 2006

It's 2 AM and I'm wondering what I'm doing posting to my blog, but after you've been staying up to 2 or 3 AM for several days, what's another 20 minutes? At least I'll sleep in my bed tonight, unlike last night--rather than go home at 3 AM only to come back for a 9:30 AM class, I used my backpack as a pillow, threw my coat over me, and slept under the desk in my research lab. I slept through the janitor emptying the trash, the network guy resetting a switch in our lab, and my advisor coming through, and I woke up 5 minutes before class started.

Why the late hours? Today was the last day of Fall semester, meaning that final projects were due, and I slid my last one under my teacher's door ten minutes ago. Now I just have my two finals to take: one Monday and one Tuesday. After that, I'm free for the Christmas break!

Desperate to get away from school for a few hours, I went by the animal shelter, where I got pressed into giving vaccination shots to a group (covey? conglomeration? herd?) of cats. These were the cats that were getting moved from "strays" to "adoption," and we had to bring them up to date on their health records. I've given a shot once or twice before, but this was my first time doing it en masse. And I only shot myself once while doing it.

We also had an interesting conversation about decapitation. Two police officers from different towns stopped by at the same time, both dropping off stray dogs; and they, the shelter director (an ex-cop), and the shelter supervisor (an otherwise no-nonsense lady with a soft spot for cats) started discussing procedures for when an animal that had attacked someone was brought in. Animals that attack humans have to be checked for rabies, and since they're scheduled for euthanasia anyway, you can do two things.

One, you can keep them for 10 days, watching for signs of rabies, then have a vet come by to check them before euthanizing them. Two, you can immediately euthanize them, cut off their head, and send it to the state lab for the brain to be analyzed for rabies. It's a trade-off: in the first method, you have to use up resources taking care of a dog that will be euthanized anyway; in the second, you have to...well, cut off their head. It's not a job for everyone.

The shelter director was saying that he was always willing to do the job (decapitation) himself; he never forced one of the shelter employees to do it if they were unwilling. He was going on about how hard it was, and how few people had the ability to do it, and the shelter supervisor was laughing because she and I had decapitated a cat only a few days earlier.

"Are you squeamish?" she had asked me, walking past.

Be careful how you answer that question. Next thing I knew, I was holding a dead, stiff cat (it had been in the cooler) while she wielded a giant pair of pruning shears. It was surprisingly clean and simple: one snip and it was done, and there was no blood since the cat was a cool 42 degrees.

Well, enough about decapitation. Back on the school note, I left a program running on my work computer that is supposedly trying to figure out a good equation for guessing movie ratings. Netflix is offering a million dollar prize to anyone who can develop a program that recommends movies better than theirs, and I chose that problem as my semester-project in my machine learning class.

I don't have any expectations of winning, or even coming close--and the final report of the project was the one I already turned in--but the problem is intriguing and I want to leave my program running a few days longer, to see how it fares. We'll see. If I get around to it, I'll try to post a high-level description of my approach. Not that it's interesting to anyone but me, but hey! it's my blog!


K La said...

the noun for multiple cats is clowder

Xirax said...

group (covey? conglomeration? herd?) of cats

I left a program running on my work computer that is supposedly trying to figure out a good equation for guessing movie ratings.
Where does it get the data, and what kind of data is it? What ML algorithm do you use?

The Writer said...

"Clowder"? Wow. I never would have guessed. That sounds more like a quick way to say "clam chowder." Anyway, I looked it up, and that's the regular word, but according to, "clutter" is another word. I like that one. ;)

And Xirax, I'm using a genetic algorithm, where each member of the population is an equation. Each generation, the equations that calculate the most accurate ratings are kept and mutated (operands changed, operators changed, etc.) in an attempt to improve the equations even more.
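
To make the "operands changed, operators changed" idea concrete, here is a minimal Python sketch of a single mutation step. It is an illustration only, not the project's actual code: the tuple layout, the variable names "a" and "b", the size of the random nudge, and the function name mutate are all assumptions made for the example.

```python
import random

OPERATORS = ["+", "-", "*", "/"]
VARIABLES = ["a", "b"]  # hypothetical variable names, for illustration only


def mutate_operand(operand, rng):
    """Change one operand: nudge a constant, or swap a variable for another."""
    if isinstance(operand, str):
        return rng.choice([v for v in VARIABLES if v != operand])
    return operand + rng.uniform(-1.0, 1.0)  # assumed nudge size


def mutate(equation, rng):
    """Return a copy of (left, operator, right) with exactly one part changed."""
    left, op, right = equation
    part = rng.choice(["left", "op", "right"])
    if part == "op":
        op = rng.choice([o for o in OPERATORS if o != op])
    elif part == "left":
        left = mutate_operand(left, rng)
    else:
        right = mutate_operand(right, rng)
    return (left, op, right)


if __name__ == "__main__":
    rng = random.Random(0)
    parent = ("a", "+", 2.5)
    print(parent, "->", mutate(parent, rng))
```

Changing only one part per mutation keeps a child close to its parent, so an equation that already predicts well is not thrown away wholesale.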

Xirax said...

I took ML, don't worry :)

What are your parameters in an equation and where do you take the data from (some movie database, directly from the web)? How do you represent an equation? We evolved a neural network, but ours only allowed linear functions. And how do you evaluate which equation is better (forgot the term here)?

The Writer said...

Representation: my equations are controlled by a grammar. An equation is an operand, followed by an operator (+, -, *, /), followed by another operand. An operand can be a constant real value, one of five metrics/variables, or another equation.
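
A grammar like this maps naturally onto nested tuples. The Python sketch below is one possible reading of it, not the actual program: the metric names m1 through m5, the constant range, the depth limit, and the choice to treat division by zero as 0 are all assumptions added for the example.

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}
METRICS = ["m1", "m2", "m3", "m4", "m5"]  # placeholder names for the five metrics


def random_operand(rng, depth):
    """Operand := constant | metric | equation (only while depth remains)."""
    kinds = ["const", "metric", "equation"] if depth > 0 else ["const", "metric"]
    kind = rng.choice(kinds)
    if kind == "const":
        return rng.uniform(-5.0, 5.0)  # assumed constant range
    if kind == "metric":
        return rng.choice(METRICS)
    return random_equation(rng, depth - 1)


def random_equation(rng, depth=2):
    """Equation := operand operator operand, stored as a 3-tuple."""
    return (random_operand(rng, depth),
            rng.choice(list(OPS)),
            random_operand(rng, depth))


def evaluate(node, metrics):
    """Recursively compute the value of an equation for one set of metric values."""
    if isinstance(node, tuple):
        left, op, right = node
        a, b = evaluate(left, metrics), evaluate(right, metrics)
        if op == "/" and b == 0:
            return 0.0  # assumption: division by zero yields 0 instead of crashing
        return OPS[op](a, b)
    if isinstance(node, str):
        return metrics[node]
    return node


if __name__ == "__main__":
    values = {"m1": 1.0, "m2": 3.0, "m3": 0.0, "m4": 2.0, "m5": 4.0}
    print(evaluate(("m1", "+", (2.0, "*", "m2")), values))
```

The depth limit is there only so that randomly generated equations cannot nest forever; the grammar as described does not say how nesting was bounded.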

The data was actually released by Netflix: 100+ million movie ratings from 480+ thousand users across roughly 18,000 movies.

Fitness: I determine the fitness of each equation by using it to calculate the RMSE (root mean square error) between my prediction of a user's rating for a movie, and the user's actual rating. I do this 100 times. The 45 percent of my population/equations that give the best score are mutated and continue on, while the rest die.
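
For reference, RMSE over n predictions p_i with actual ratings r_i is sqrt((1/n) * sum((p_i - r_i)^2)), so lower is better. The short Python sketch below shows that calculation plus a "best 45 percent survive" cut. It is a sketch under assumptions rather than the real code: the function names rmse and select_survivors, and the rule of always keeping at least one survivor, are invented for the example.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predicted and actual ratings."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def select_survivors(population, scores, keep_fraction=0.45):
    """Keep the keep_fraction of the population with the lowest RMSE scores."""
    ranked = sorted(range(len(population)), key=lambda i: scores[i])
    keep = max(1, int(len(population) * keep_fraction))  # assumed: at least one survives
    return [population[i] for i in ranked[:keep]]


if __name__ == "__main__":
    print(rmse([1.0, 5.0], [3.0, 3.0]))
    population = list("abcdefghij")
    scores = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
    print(select_survivors(population, scores))
```

With ten equations and a 45 percent cut, four survive; the survivors would then be mutated to refill the population for the next generation.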