My main passion is the interaction between humans and computers: improving the way computers understand us and employing different technologies to bring more knowledge to more people.
In particular, I work on algorithms and concepts concerned with data, its organization, and how to make it more accessible and usable. One thing computers can do far better than humans is keep track of millions of data points and make sense of them. I believe it is important to exploit this ability to aid humans by providing better access to better-customized content.
I have recently finished my M.Sc. degree at the
Computational Linguistics department
of Saarland University, Germany.
My thesis (pdf) was written under the supervision of Dr. Valia Kordoni and Dr. Yi Zhang. It focuses on improving the computer's ability to parse natural language by integrating semantic knowledge using machine learning methods. The work was carried out within the DELPH-IN framework, using a deep linguistic HPSG grammar formalism, and was presented at LREC 2010 (Paper, Slides).
Although it is not currently the main focus of my work, I am always very interested in Semitic language processing (Hebrew in particular) and the different challenges these languages pose.
A nice project we have just completed (joint work with Prof. Miriam Shlesinger of Bar-Ilan University) is the compilation of a large-scale Hebrew corpus collected from the WWW. The full code (mostly Python and Perl) and some notes can be found here. The corpus itself can be downloaded here (gzip file). Some of the scripts are modifications of scripts provided in the BootCat toolkit.
Another topic I am always willing to discuss is bioinformatics. This domain also involves massive amounts of data, and extracting relations and information from it presents special challenges and uses. Here is a report and a suggested baseline for protein-interaction detection.
Here you can read my CV.