## Wednesday, September 29, 2010

### Eugene Charniak's talk and my understanding

I was definitely not a linguistics guy, and I seriously doubt that I am one now. First, I have never read a whole NLP paper seriously; second, I have no idea who the masters of NLP are.

So it is no surprise that I had never heard of Prof. Charniak before today's talk, although he is apparently a well-known senior researcher in NLP. He came to Maryland to deliver a talk on Top-Down Nearly Context-Sensitive Parsing. What is top-down parsing? From my point of view, unlike the earlier bottom-up parsing methods, which start from the "leaves" of the parse tree, top-down parsing works in a psychologically plausible way, parsing the sentence word by word.
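To make the top-down idea concrete for myself, here is a minimal sketch of a naive top-down (recursive-descent) parser. It is not Prof. Charniak's parser: the toy grammar, the lexicon, and the names `GRAMMAR`, `LEXICON`, `expand`, `expand_sequence` and `parse` are all my own assumptions. It starts from the root symbol `S` and consumes the sentence left to right, one word at a time.

```python
# Toy grammar and lexicon, made up for illustration only.
# The grammar has no left-recursive rules; a naive top-down
# parser like this one would loop forever on those.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["N"]],
    "VP": [["V", "NP"], ["V"]],
}
LEXICON = {"the": "Det", "prices": "N", "market": "N", "fell": "V", "hit": "V"}


def expand(symbol, words, pos):
    """Yield (tree, next_pos) for each way `symbol` can cover words[pos:next_pos]."""
    if symbol not in GRAMMAR:
        # Preterminal: consume exactly one word if its tag matches.
        if pos < len(words) and LEXICON.get(words[pos]) == symbol:
            yield (symbol, words[pos]), pos + 1
        return
    # Nonterminal: try each rule, starting from the top of the tree.
    for rhs in GRAMMAR[symbol]:
        yield from expand_sequence(rhs, words, pos, (symbol,))


def expand_sequence(rhs, words, pos, partial):
    """Expand the symbols of one rule body left to right."""
    if not rhs:
        yield partial, pos
        return
    for child, nxt in expand(rhs[0], words, pos):
        yield from expand_sequence(rhs[1:], words, nxt, partial + (child,))


def parse(sentence):
    """Return every tree rooted at S that covers the whole sentence."""
    words = sentence.split()
    return [tree for tree, end in expand("S", words, 0) if end == len(words)]


if __name__ == "__main__":
    for tree in parse("the prices hit the market"):
        print(tree)
```

A bottom-up parser would instead start from the tagged words and combine them into larger constituents until it reaches `S`.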

The main motivations of his work are: (1) people understand a sentence long before it ends; (2) indeed, evidence suggests we understand a sentence on a word-by-word basis; (3) syntactic trees specify semantic roles, so we must have a fully connected tree at all times.

He also listed two main drawbacks of CFGs:
(1) CFGs restrict us to using dynamic programming. Dynamic programming is great because it can greatly reduce computational complexity, but to get that benefit we have to make some sacrifices. One is that a lot of different parse trees have to be put into the same dynamic programming variables.
(2) We have to do smoothing, but somehow smoothing is very bad.

He then briefly introduced a method based on random forests, which can keep much more conditioning information when processing data. For example, the probability of the word "prices" changes as more context is conditioned on:

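$P(\text{prices} \mid \text{n-plural}) = 0.013$
$P(\text{prices} \mid \text{n-plural}, \text{NP}) = 0.013$
$P(\text{prices} \mid \text{n-plural}, \text{NP}, \text{S}) = 0.025$
$P(\text{prices} \mid \text{n-plural}, \text{NP}, \text{S}, \text{V-past}) = 0.052$
$P(\text{prices} \mid \text{n-plural}, \text{NP}, \text{S}, \text{V-past}, \text{fell}) = 0.146$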

Random forests are not new to me, and I am kind of tired of the claims that computer scientists have to bring psychologically or biologically plausible approaches into computer science problems. Yes, human beings understand natural language far better than today's programs do, but is that an adequate reason for machines to parse sentences the way human beings do?

The problem is: a biologically or psychologically plausible way is always a computationally implausible way!

When it comes to top-down methods, they are actually a hot topic in computer vision as well. Traditionally, CV researchers construct algorithms in a bottom-up manner: Pixel -> Region -> Region Group -> ... Research on visual attention is a typical example. Many bottom-up methods have been proposed, and several years ago people started to focus on top-down approaches. Top-down approaches require us to deal with semantic meaning, with rules such as "people tend to pay attention to faces and cars", that is, to objects they have seen before or that impressed them before (however, the object detection algorithms themselves are still bottom-up approaches). Then how can we model those "experiences"? Does "more conditioning information" solve this problem?

OK, I have to come back down to Earth. I had a long but productive discussion with my ML project mates today. One of them is working on improving object recognition by using spatial information. The final conclusion is that we can try to use the output of the recognition program as training data to train a classifier that labels the parts of an image as object or non-object. Eh... I admit it is a very ordinary idea. What excites me is that, although the recognition algorithm can only recognize 21 objects (cow, car...) and so-called non-objects (sky, grass...), what we expect is a classifier that works for any objects and non-objects. Well... It sounds interesting to me, at least for now~

BTW, I suddenly feel that the language parsing problem is like the image segmentation problem in CV... Is it?

## Tuesday, September 28, 2010

### "How I met your mother"

When your father tells you "how I met your mother", what do you think of? A "younger father" might jump into your mind, as well as a "younger mother". They met each other in a romantic atmosphere.

The point is that human beings can easily map language to an image or a piece of video. It is like the works of Impressionism: the image might lack details. It is not real vision; it is virtual.

Can we find a method to generate these kinds of virtual vision?

The Starry Night (1889)
Vincent van Gogh

The Machine Learning class has reached the neural networks part. I think ANNs are a great example of how researchers achieved a kind of success by comparing a computational system with the biological human system, and how they finally realized that it has almost no relationship with the human system at all. Biological thinking can inspire us, and it always sounds fantastic, but is it a solid way of thinking?
Let's talk about the name "Machine Learning". From my point of view, as statistical methods rule this area nowadays, the name should be changed to "Large-Scale Data Analysis"... The word "learning", in its biological or psychological sense, did inspire the pioneers in this area, but in fact we are doing other stuff, quite different from "learning"...

## Monday, September 27, 2010

### Tex Test

$\int_{0}^{1}\frac{x^{4}\left(1-x\right)^{4}}{1+x^{2}}dx=\frac{22}{7}-\pi$
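A quick check of the identity above, by polynomial long division of the numerator by $1+x^{2}$ and term-by-term integration:

$\frac{x^{4}\left(1-x\right)^{4}}{1+x^{2}}=x^{6}-4x^{5}+5x^{4}-4x^{2}+4-\frac{4}{1+x^{2}}$

$\int_{0}^{1}\left(x^{6}-4x^{5}+5x^{4}-4x^{2}+4\right)dx=\frac{1}{7}-\frac{2}{3}+1-\frac{4}{3}+4=\frac{22}{7}$

$\int_{0}^{1}\frac{4}{1+x^{2}}dx=4\arctan(1)=\pi$

Subtracting the last line from the one before gives $\frac{22}{7}-\pi$. Since the integrand is positive on $(0,1)$, this also shows $\frac{22}{7}>\pi$.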

### Thinking in Vision and Language

An interesting discussion went on in our group today. The core problem is the "stone lion". State-of-the-art object recognition methods can train a classifier to recognize a "lion", but how do we discriminate a "stone lion" from a "real lion"?

So it seems that today's vision programs can solve the problem of "what is it?", but really cannot answer questions like "what is it like: tall? short? what material?"

From my point of view, it really depends on the level of the intelligent system. For example, humans can easily discriminate a scarecrow from a real man, but birds cannot. That is why a scarecrow can scare birds away.

Another interesting problem haunting me these days comes from Linguistics class project 1: I mis-implemented the HMM decoding algorithm, leaving out the back-tracking step, yet it gives better results than the correct implementation. Apparently, without back-tracking, I am doing a greedy path selection over the HMM. It seems that the greedy algorithm performs better than the optimal one... Weird...
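To pin down the difference, here is a minimal sketch that contrasts Viterbi decoding (with back-tracking) against one greedy variant. It is not my project code: the toy HMM parameters (`STATES`, `START`, `TRANS`, `EMIT`) are made up, and I am assuming that "greedy" means committing to the best next state given the previously chosen state.

```python
from itertools import product

# Toy HMM; every number here is hypothetical, chosen only for illustration.
STATES = ("N", "V")
START = {"N": 0.6, "V": 0.4}
TRANS = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
EMIT = {
    "N": {"prices": 0.5, "fell": 0.1, "fast": 0.4},
    "V": {"prices": 0.1, "fell": 0.6, "fast": 0.3},
}


def path_prob(path, obs):
    """Joint probability of a state path and an observation sequence."""
    p = START[path[0]] * EMIT[path[0]][obs[0]]
    for prev, cur, word in zip(path, path[1:], obs[1:]):
        p *= TRANS[prev][cur] * EMIT[cur][word]
    return p


def viterbi(obs):
    """Most probable state path, recovered by following back-pointers."""
    delta = [{s: START[s] * EMIT[s][obs[0]] for s in STATES}]
    back = []
    for word in obs[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda r: delta[-1][r] * TRANS[r][s])
            pointers[s] = best_prev
            scores[s] = delta[-1][best_prev] * TRANS[best_prev][s] * EMIT[s][word]
        delta.append(scores)
        back.append(pointers)
    # Back-tracking: start from the best final state and walk backwards.
    state = max(STATES, key=lambda s: delta[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def greedy(obs):
    """Commit to the locally best state at each step; never revise it."""
    path = [max(STATES, key=lambda s: START[s] * EMIT[s][obs[0]])]
    for word in obs[1:]:
        prev = path[-1]
        path.append(max(STATES, key=lambda s: TRANS[prev][s] * EMIT[s][word]))
    return path


def brute_force(obs):
    """Enumerate all paths; only feasible for tiny examples like this one."""
    return max(product(STATES, repeat=len(obs)), key=lambda p: path_prob(p, obs))


if __name__ == "__main__":
    obs = ("prices", "fell", "fast")
    for name, decode in (("viterbi", viterbi), ("greedy", greedy)):
        path = decode(obs)
        print(name, path, path_prob(path, obs))
```

Under the model's own probabilities, the Viterbi path can never score lower than the greedy path. But "optimal" here means most probable under the model, which is not the same thing as highest tagging accuracy against the gold labels, so a greedy decoder scoring better on accuracy is possible, perhaps when the estimated model fits the data poorly.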

Tested several recognition filters from VOC on the ESG dataset. The results are not bad. Still, for example, people will label an image "car" even when the picture was taken inside a car. Apparently the filters failed to detect a car in those pictures.

ICPR?  International Conference on Pattern Recognition? No.....

