GSoC : Community bonding

The community bonding period of this year’s Google Summer of Code is nearing an end. It’s been a rather busy week, and I had to juggle time between exam prep and GSoC. I cannot say that I have made much progress. However, an IRC meeting with my mentor turned out to be very fruitful. It was about setting up the right development environment, and I did learn a lot!

1. ctags/etags : I was complaining about how hard it is to find function definitions in the libvarnam codebase. There are a lot of header files. That’s when I heard about ctags. I had to install the ctags package from the Ubuntu repositories and configure it to catalogue the libvarnam folder. Then I got myself the Sublime Text editor and installed the plugin for ctags. Now all I have to do is press ctrl+t+t when I encounter a function call and Sublime will open the definition of that function in a separate tab! Productivity multiplied – tenfold!

Another way (though not as convenient) would be to use grep -iR. The -i flag makes the search case-insensitive and -R makes grep search the folder recursively, printing each matching line along with the file it was found in.

2. Nemiver : I have used the GNU debugger (gdb) in my lab before. The programs I wrote then were rather small and I could live without a debugger. But my mentor says no. Nemiver is a rather neat front end to gdb and I don’t have to look up line numbers to insert breakpoints anymore – I click on the line instead. Also, Nemiver makes the print command in gdb quite obsolete. Nemiver shows the values of all the variables in scope as a list.

3. Sample project : My first task. In order to get myself familiar with libvarnam and learn some debugging in the process, the mentor asked me to write a sample project. My sample program, found here, converts all the string literals in a Python program into their corresponding Malayalam equivalents. Simple and buggy. But I did learn how to make Nemiver step into the libvarnam API and do some transliteration.
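
For the curious, here is a rough sketch of how the string-literal part of such a program could look. It is not my actual sample program: the transliterate() function below is a hypothetical stand-in for the libvarnam call (it only upper-cases the text), and the idea of using Python’s tokenize module is just one way to find the literals.

```python
import ast
import io
import tokenize


def transliterate(text):
    """Hypothetical stand-in for a libvarnam call; it only upper-cases."""
    return text.upper()


def convert_string_literals(source):
    """Return source with each plain string literal run through transliterate()."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.STRING:
            try:
                value = ast.literal_eval(tok.string)
            except (ValueError, SyntaxError):
                value = None  # e.g. an f-string on older Pythons: leave it alone
            if isinstance(value, str):
                # Keep the original position info, swap only the token text.
                tok = tok._replace(string=repr(transliterate(value)))
        tokens.append(tok)
    return tokenize.untokenize(tokens)


if __name__ == "__main__":
    print(convert_string_literals('print("hello", 1)  # "not a literal"\n'))
```

Going through the tokenizer instead of a regular expression means quotes inside comments are left untouched.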

Now that I’m getting a few days gap before the last exam, I must fix a bug or two. I hope I’ll be able to start working on the stemming algorithm starting May 20th.

Google Summer of Code!

I’m excited to announce that I’ve been selected for this year’s Google Summer of Code. My mentoring organization is SMC – Swathantra Malayalam Computing and I will be working on the varnam project.

Varnam means ‘colors’. Varnam is a transliterator for Indic languages. My task is to improve the learning capability of varnam by coming up with a stemmer algorithm for Indic languages. A stemmer algorithm returns a base word when it is supplied a complex word. In English, supplying ‘retirement’ to the Porter stemmer algorithm will trim it down step by step and finally return ‘retir’. I have to do the same thing with Malayalam words. The trick is to design the whole thing in such a way that stemming support for other languages can be easily added. The stemming rules will differ from language to language. Though I will be laying down the rules for Malayalam, I should leave room for someone else if she decides to add support for another language. In short, my algorithm should be designed to read a ‘rule file’.
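
To make the ‘rule file’ idea a bit more concrete, here is a minimal sketch. Everything in it is an assumption made up for illustration: the ‘suffix -> replacement’ file format, the function names, and the toy English rules (this is not the Porter algorithm and not what will end up in varnam).

```python
def parse_rules(rule_text):
    """Parse lines of the form 'suffix -> replacement' into (suffix, replacement) pairs."""
    rules = []
    for line in rule_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        suffix, _, replacement = line.partition("->")
        rules.append((suffix.strip(), replacement.strip()))
    # Try longer suffixes first so the most specific rule wins.
    rules.sort(key=lambda rule: len(rule[0]), reverse=True)
    return rules


def stem(word, rules, min_stem_length=2):
    """Apply the first matching suffix rule; return the word unchanged if none fits."""
    for suffix, replacement in rules:
        remaining = len(word) - len(suffix)
        if suffix and word.endswith(suffix) and remaining >= min_stem_length:
            return word[:remaining] + replacement
    return word


# Toy rules for illustration only (hypothetical file contents).
TOY_ENGLISH_RULES = """
ement ->
ing ->
ies -> y
"""

if __name__ == "__main__":
    rules = parse_rules(TOY_ENGLISH_RULES)
    for word in ["retirement", "ponies", "sing"]:
        print(word, "=>", stem(word, rules))
```

Since the stemmer only ever sees the parsed rules, adding another language would come down to writing another rule file.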

The varnam project can be found here. Why use varnam when you have, say, Google Input Tools? For one, as far as I know Google Input Tools works only on Windows. Two, I’m not sure if you can use it in your own programs. I guess not. Three, it is not open source, which means Google won’t let you take a peek inside. Four, varnam can render the whole Linux shell in Malayalam if need be (and if you are willing to put in the effort)! To be frank, seeing small round Malayalam letters on my desktop Konsole was quite unexpected!

I’m so grateful to SMC for letting me work on this and even more grateful to google for the upcoming paycheck ;). SMC requires us to keep the blog updated on a weekly basis, so I guess everyone will be hearing an awful lot from me 😀

Getting machines to listen

I’ve been wanting to do this sound localization project for almost a year now. It’s simple – have a few microphones ready, yell at them, and display the direction of the sound on a screen. And being the electronics newbie I am, I had spent a considerable amount of time wondering how to connect an electret mic to my board/computer.

However, the last few days have been extremely productive and now I understand what exactly goes on when you yell at your laptop mic.

The Task : Record something using the built in mics (or any mic), store it in RAW format, access it using a python program. Understand sound.

The tool : Ladies and Gentlemen, meet ALSA – Advanced Linux Sound Architecture
Another tool : pyalsaaudio. Helps us do the necessary stuff without resorting to the C API.

You already have ALSA if you are using any of the major GNU/Linux distributions. First get pyalsaaudio and install it on your computer.

The pyalsaaudio page has some nice examples of how to record sound. Go through it. By default, your recordings are in RAW PCM format. PCM stands for pulse code modulation. The output of your recording is actually a set of values that denote the amplitude of the input sound at various points in time. Just go through and try running the examples that come with the pyalsaaudio source – record.py and playback.py.

Your recording is saved as a PCM RAW file. Try opening it in a text editor and you will see junk values.

But where are the amplitude values?

Exactly. Before that, we will try playing it back. Of course you can do it using the example program playback.py. But there’s another way. Try this on the terminal :

aplay -r 44100 -f S16_LE -c 1 test.pcm

That command will make ALSA play back the recording for you. 44100 is the sampling rate of the recording. How did we know that? Look inside record.py and you will see that the recording was sampled at 44.1 kHz. S16_LE denotes that there are 16 bits per sample in our recording, stored as signed integers in little endian format. Now how did we know that? Again, check record.py. The ‘-c 1’ tells ALSA that our recording is mono.

So that’s how you playback raw PCM files.
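
Those three numbers are also all you need to work out how long a raw recording is, since the file has no header. A tiny sketch (the function name is made up, and the defaults simply mirror the aplay flags above):

```python
def pcm_duration_seconds(num_bytes, sample_rate=44100, bytes_per_sample=2, channels=1):
    """Length of a headerless PCM recording, given its size in bytes."""
    frames = num_bytes // (bytes_per_sample * channels)  # one frame = one sample per channel
    return frames / float(sample_rate)


if __name__ == "__main__":
    # 16-bit mono at 44.1 kHz uses 88200 bytes for every second of sound.
    print(pcm_duration_seconds(88200))
```

For a real file you would pass in os.path.getsize() of the recording.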

But what if you want to do some signal processing? What if you want to draw one of those spectrums or waveforms and other seemingly complicated stuff? Then you will need to extract the data from the RAW PCM. It sounds complicated, I know. Have no fear, Python (and NumPy) is here:

import numpy
# S16_LE means signed 16-bit little endian samples, hence dtype '<i2'
data = numpy.memmap("test.pcm", dtype='<i2', mode='r')
print("VALUES:", data)

Finally, something human readable!
Why don’t we draw a graph?
import numpy
import matplotlib.pyplot as plt

data = numpy.memmap("test.pcm", dtype='<i2', mode='r')
print(data)
plt.plot(data)  # sample index on the x axis, amplitude on the y axis
plt.show()

Note : I’m very grateful to this guy for showing me how to do this.

So what do we have here? We have successfully made use of the RAW PCM data. So? There are many algorithms in SciPy and NumPy that can do amazing things with that data – cross correlation, FFT, convolution. Thank God (and Guido van Rossum) for Python!
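
As a taste of what that looks like, here is a small sketch that finds the loudest frequency in a block of samples using NumPy’s FFT. The function names are mine, and the test tone is generated in code so that the example runs without a recording; with a real file you would pass in the memmap’d data instead.

```python
import numpy as np


def dominant_frequency(samples, sample_rate=44100):
    """Return the frequency (in Hz) with the largest magnitude in the spectrum."""
    samples = np.asarray(samples, dtype=np.float64)
    samples = samples - samples.mean()            # remove any DC offset
    spectrum = np.abs(np.fft.rfft(samples))       # magnitudes for real-valued input
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]


def make_test_tone(freq_hz, seconds=1.0, sample_rate=44100):
    """Generate a sine wave as 16-bit samples, like a mono S16_LE recording."""
    t = np.arange(int(seconds * sample_rate)) / float(sample_rate)
    return (10000 * np.sin(2 * np.pi * freq_hz * t)).astype(np.int16)


if __name__ == "__main__":
    print(dominant_frequency(make_test_tone(440.0)))
```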

S-tall-man the tall man

SPACE – Society for the Promotion of Alternative Computing Environments celebrated their 10 years of existence today here at Trivandrum and they invited none other than Richard Stallman to do the talking. I was 13 or 14 years old when my father first told me about Richard Stallman and about the ‘big things’ he was doing with computers. The only impression I got from his photo was that he was a particularly huge man with a huge beard and loooong hair. Today, I saw this guy ‘for real’. Mr. Stallman has had a rather warm relationship with the government of Kerala and he had visited the state several times before. After all, this guy had convinced our government to get rid of the Windows PCs and move to free software. Thanks to Stallman – I’m not using that ancient Turbo C++ compiler to compile my C++ programs.

Not so huge from where I am sitting!
Shake hands for freedom

The talk was semi-boring. The people on the dais dozed off while Stallman was busy giving us a lecture on digital freedom. Nevertheless, he did come up with a few new things to say. There was the usual tantrum about security and privacy on the internet and unjust surveillance. He explained that he has never owned a portable (mobile) phone and never will. Said he would rather not use that piece of technology than let someone spy on him. A friend brought to my attention the rather unusual similarity between Stallman’s way of talking and the flavorless, emotionless voice used by Microsoft Windows to read things aloud. If I were blind, I would have (could have, to be more precise) mistaken Stallman’s lecture for a Windows machine reading text aloud!

Then he moved on to his forte – free software. As usual, he specifically portrayed Microsoft and the ‘ithings’ as evil and what you and I call open source as a lesser evil. Open source was not enough for him, he wanted things to be ‘free’. This raised a few eyebrows, and I think it raised my eyebrows the farthest. But I already knew this from the Wikipedia entry on him so I was more or less feigning surprise. Also, as he has been doing for years, he emphasized the necessity and the importance of referring to Linux as GNU/Linux (pronounced GNU slash Linux or GNU plus Linux) instead of just Linux. He gave a convoluted philosophical explanation for it but it was bull shit. It seems the man has a problem with Linus Torvalds.

Then he said something really meaningful. He said that people have the right to use proprietary software and it is actually OK to do so. “Even though they are hurting themselves”, said Stallman, “they are not doing harm to the society”. But governments and states are established for the purpose of serving man. Institutions and frameworks with public benefit in mind should never ever use proprietary software. Doing so puts the “national security at risk”.

This makes sense. Now that he has said it, I feel victimized when I remember that the only operating system available on our school computers was Microsoft Windows. I feel that I, along with a bunch of other kids, was forced to use commercial software that was just popular but not universal. Since we were learning to program and not to ‘use’ computers, it would have been better if we were trained on a *nix machine – at least it is free and can be modified at will. Using only Microsoft Windows at school is like having to eat only one flavour of ice cream while you are at school. States should not have the option of using proprietary software. The source code should be open to examination to make sure that there are no backdoors or security holes that can be used for malicious purposes. Good thinking Mr. Stallman 🙂

There was another revelation in there for me – Stallman claimed that Facebook monitors a lot of people – people who don’t even have a Facebook account. The logic was this. When your browser loads a website that contains the ‘Share via Facebook’ button, your browser requests the Facebook server to provide it with the blue and white ‘F’ image. Your browser tells the Facebook server that it is for your computer (your IP) and for use on this website. Hurray, Zuckerberg knows which site you are surfing!
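
Here is a small sketch of that logic in Python. Nothing is actually sent over the network, the URLs and the IP address are made-up placeholders, and whether a real browser includes the Referer header at all depends on its settings and on the page’s referrer policy. So treat it as an illustration of the claim, not as proof of what any particular server logs.

```python
from urllib.request import Request


def build_button_request(button_url, page_url):
    """Build (but do not send) the kind of request a browser makes for an embedded image."""
    return Request(button_url, headers={"Referer": page_url})


def what_the_server_learns(request, client_ip):
    """Collect what the server hosting the button could read off the request."""
    return {
        "client_ip": client_ip,                         # visible from the connection itself
        "visited_page": request.get_header("Referer"),  # the page embedding the button
        "requested": request.full_url,
    }


if __name__ == "__main__":
    req = build_button_request(
        "https://social.example/share-button.png",      # hypothetical button image
        "https://news.example/some-article",            # hypothetical page you are reading
    )
    print(what_the_server_learns(req, "203.0.113.7"))   # made-up documentation IP
```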

As for me, I would go with the open source guys. Stallman and software that is ‘truly liberated’ is an extreme, and I don’t like extremes. Not because extremes are difficult to live with, but because such extremes almost always have little practical value and their advocates are often fanatics who refuse to be convinced otherwise. Nevertheless, people like you and me are indebted to Richard Stallman for things like GNU, and drum rolls please – gcc (the GNU Compiler Collection, originally the GNU C Compiler).

P.S – Mr. Stallman held an auction at the hall and sold a ‘gnu’ doll – and I came to know that there is an animal called the gnu and it lives in Africa. The doll was sold for Rs. 2500.

Back with some brains!

It’s been quite some time since I posted, I admit. The good news is, I’m back. More good news: I found something cool!

Gone are the days when the words artificial intelligence suddenly pull up neural networks in your mind. Actually, gone are the days of artificial intelligence, it seems. Cognitive computing is going to be the norm, or at least I hope so. Simply put, cognitive computing is mimicking the way we humans think and making computers do the same.

“But I thought that was what artificial intelligence was all about”

Yes and no.

Though cognitive computing might actually qualify as a way of implementing intelligent machines, conventional artificial intelligence was problem specific. There was usually a separate “learning phase” where we had to feed tons of data to the supposedly intelligent machine. Cognitive computing is a significant improvement on this considering that these machines can learn online. That is, there is no separate learning phase. The machine learns as it works, just like we humans do.
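
To show what ‘learning online’ means in the simplest possible terms, here is a toy sketch. It has nothing to do with how HTM or the CLA work internally: it is just a running mean and variance (Welford’s method) that scores every new value against what it has seen so far and then immediately learns from it. The class name and the threshold are my own choices.

```python
class OnlineAnomalyDetector:
    """Toy online learner: no separate training phase, it updates on every value."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # how many standard deviations count as unusual
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def observe(self, value):
        """Score the value against past data, then learn from it."""
        is_anomaly = False
        if self.count >= 2:
            std = (self.m2 / (self.count - 1)) ** 0.5
            is_anomaly = std > 0 and abs(value - self.mean) > self.threshold * std
        # Welford's incremental update of mean and variance.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return is_anomaly


if __name__ == "__main__":
    detector = OnlineAnomalyDetector()
    for reading in [10, 11, 9, 10, 11, 9, 10, 50]:
        print(reading, detector.observe(reading))
```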

Numenta is a company that deals with the above said “stuff”. They have built a platform (or is it software?) called nupic (Numenta Platform for Intelligent Computing) which implements something known as hierarchical temporal memory (HTM). And it uses a cortical learning algorithm (CLA) to mimic the human brain. Basically, nupic functions more or less like we do.

Enough boring theory.

Visit numenta and nupic here : http://numenta.org/

The nupic is open source and you can get the source on github :

https://github.com/numenta/nupic

A warning though. Nupic has a steep learning curve. So get your hands dirty only if you have some time and patience. I couldn’t get the tests on the build (to check if nupic installed correctly) to run successfully and am still asking around for solutions (the mailing list is great).

And they have some awesome videos of previous hackathons and example programs that clearly demonstrate the power of nupic. Here’s a link.

Coloring the canvas – the processing way

I thought it was time somebody started painting on the canvas. It’s been a couple of months since I wrote this program and it was lying in a corner of my hard disk all along. I realized kevinkoder.tk is pretty low on content and decided to add it to the site.

http://www.kevinkoder.tk/paint7.htm

And the source code can be viewed by pressing ctrl+u (in Chrome). But I must warn you though – I’ve put absolutely zero effort into making the code readable. I’m pretty sure even I can’t make sense of it right now. I was in a hurry to do something with Processing and thought “Hey, let’s make a paint program”. The development came to an abrupt standstill when I tried to implement the ‘fill bucket’ tool using the flood fill algorithm – the damn thing simply refused to work the way I wanted it to.
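
For reference, here is what a basic flood fill looks like, sketched in Python on a small grid of color numbers rather than in Processing on real pixels. It is not the code from my paint program. It uses an explicit stack instead of recursion because deep recursion is a common way for flood fills to blow up on large areas; whether that was my actual problem, I can’t say.

```python
def flood_fill(grid, start_row, start_col, new_color):
    """Recolor the 4-connected region containing the start cell, in place."""
    rows, cols = len(grid), len(grid[0])
    old_color = grid[start_row][start_col]
    if old_color == new_color:
        return grid  # nothing to do, and avoids looping forever
    stack = [(start_row, start_col)]
    while stack:
        r, c = stack.pop()
        if not (0 <= r < rows and 0 <= c < cols):
            continue  # off the canvas
        if grid[r][c] != old_color:
            continue  # a boundary, or already filled
        grid[r][c] = new_color
        stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    return grid


if __name__ == "__main__":
    canvas = [
        [0, 0, 1],
        [0, 1, 0],
        [1, 0, 0],
    ]
    for row in flood_fill(canvas, 0, 0, 2):
        print(row)
```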

Anyways, the program proves beyond all doubt that a lot of amazing things are simply doable with processing.

And I had to choose not to include a lot of additional libraries that might have made it a lot easier to develop the program (like a Java library that can add additional drawing layers) because 3rd party Processing Java libraries are incompatible with processing.js, i.e., they won’t work when you convert the Processing code into JavaScript.

A few handy ‘features’ were added to compensate for the ultimate pathetic-ness of the program. More info can be found in the documentation.

And the documentation : http://www.kevinkoder.tk/paint_doc.htm


Fading menu

Remember me ranting about that free domain name? Well I’ve put it to some good use. Here’s the link to my very first sketch to be hosted on the world wide web :

http://www.kevinkoder.tk/test.htm

Just a few sentences thrown all over the canvas and playing hide and seek in the snow. Nothing much really.  Oh, and here’s the link to the source :

http://www.kevinkoder.tk/data_menu/fading_menu.pde

If you have a slow connection please wait a bit. The music is around 250 kb. Also, text looks a bit drab because I couldn’t get the fonts to work properly with processing.js

And the link to processing tutorial is inside the canvas. Happy hunting 😉

Tokelau Beckons

EDIT: The domain I registered was taken away. I do not know what happened. Probably because it was free and they figured out I’m never going to upgrade. kevinkoder.tk is officially down. kevinkoder.hostei.com still works, though

It’s been quite a while since my last post. But I’ve been busy : Skyrim, Total War, FIFA, swimming lessons... whew! And even though I DID do some coding all this while, I believe it’s too premature to share. And what now? There’s some news – Tokelau! Apparently, Tokelau is a territory of New Zealand in the South Pacific. They’ve got coral reefs, great beaches, and everything else that fits the description of a ‘tropical paradise’.

Tokelau : smallest economy in the world, but drop dead gorgeous!

And what Tokelau has got to do with computers and coding might be of interest to many a soul – they give away free domain names under their top level domain. Top Level Domains (TLDs) are the endings like .com and .net, and a name registered directly under one makes for a nice looking URL. If you’ve ever signed up for a free web hosting service before, you’d know that your site gets a pretty weird name like yoursitename.freewebhosting.com. It certainly doesn’t look nice, because it is only a subdomain of the host’s own domain. If you want to have something like http://www.mysite.com, you’ll have to register (i.e. pay for) a domain. Well, .tk is a TLD and you can register a name under it for free. And one interesting bit of news – Tokelau was able to achieve a 10% increase in its GDP through domain sales alone! So how do you get a http://www.mysite.tk site? It’s pretty simple really.

First you’d need a host. A host is some server where you can place all your files (contents of your site) and programs. I recommend 000webhost. Setting up a site at 000webhost will give you a site with the URL http://www.kevinkoder.hostei.com or something similar. Set up a site with any name and upload your .htm file to the public_html folder. If the previous sentence sounds very very alien to you, I strongly recommend that you learn some HTML from w3schools.com (it’s as easy as saying 1,2,3,4… out loud) and do some basic research on creating and uploading webpages. Your site at 000webhost is not going to have a nice URL, for now.

Second, you need to register a domain at dot tk. Go to http://www.dot.tk and register a domain. Registering a domain simply involves creating an account at dot.tk and giving your required URL (eg : http://www.kevinkoder.tk). They will check if the domain name is already taken and will register it for you if it is not already registered by someone else.

Now, you have a site up and running at 000webhost (or some other web host) with a weird URL like kevinkoder.hostei.com and you have registered the domain http://www.kevinkoder.tk. Now what you need to do is ‘point’ http://www.kevinkoder.tk to kevinkoder.hostei.com. That is, when someone types http://www.kevinkoder.tk into their browser, the browser should load the contents situated at http://www.kevinkoder.hostei.com (hosted at 000webhost or any other webhost).

To do this, you need to know the nameservers of your host. For 000webhost, this information is available in the ‘account details’ section inside the cpanel. Copy the addresses of the nameservers and their IP addresses as well. Now log in to http://www.dot.tk and go to the domain panel. Click the modify button corresponding to your domain name. Now under the DNS settings, you will be prompted to use either their DNS, or just do domain forwarding, or use custom DNS.

DNS stands for domain name system. Every site on the internet is actually reached through a string of numbers known as the IP address. For example, facebook.com might actually be a number like 203.0.113.5 (a made-up example). Every time you access a site, you are actually connecting to the computer with the corresponding IP address. But remembering all those numbers is a pain in the ass. That is why we map names to these numbers. DNS is the system which determines which names go to which numbers.

To make http://www.kevinkoder.tk point to the number represented by http://www.kevinkoder.hostei.com, we need to supply the ‘numbers’ at http://www.dot.tk. Select custom DNS. Now you’ll see two boxes with the titles ‘nameserver’ and ‘ip address’. Paste the information you copied from account details at 000webhost into these 2 boxes. That’s it. You’ve done it.
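
If it helps, here is a toy model of what those two boxes set up. It is nowhere near how real DNS works (no caching, no root servers, no record types), and the nameserver name and IP address are invented placeholders. The point is only the two-step lookup: the .tk registry tells you which nameserver to ask, and that nameserver knows the number.

```python
# What you type into dot.tk: which nameserver is responsible for your domain.
TK_REGISTRY = {
    "kevinkoder.tk": "ns01.example-host.test",  # hypothetical nameserver name
}

# What the web host's nameserver knows: the names it can answer for.
NAMESERVERS = {
    "ns01.example-host.test": {
        "kevinkoder.tk": "203.0.113.5",         # made-up documentation IP
        "www.kevinkoder.tk": "203.0.113.5",
    },
}


def resolve(name, registry=TK_REGISTRY, nameservers=NAMESERVERS):
    """Return the IP address for a name, or None if nobody knows it."""
    domain = ".".join(name.split(".")[-2:])     # www.kevinkoder.tk -> kevinkoder.tk
    nameserver = registry.get(domain)
    if nameserver is None:
        return None                             # domain not registered
    return nameservers.get(nameserver, {}).get(name)


if __name__ == "__main__":
    print(resolve("www.kevinkoder.tk"))
    print(resolve("www.nosuchsite.tk"))
```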

But http://www.kevinkoder.tk won’t point to the contents of http://www.kevinkoder.hostei.com (or wherever you have hosted your site) right away. The thing is, it can take up to around 48 hours for the change to ‘propagate’ through the web. So give Tokelau some time and check back later. Meanwhile, as the deputy ambassador of Tokelau (check the ambassadors programme), you might want to consider persuading a few more people to join 😉

Back to the Basics

I admit that what I am about to write will not really fit in with the ‘theme’ of the blog – share code. But yet, I feel compelled to share a tiny story happening around a Windows XP ISO image sitting inside a pendrive.

I have an old desktop at home that is dying, and in an attempt to make the remainder of its not so glorious life more useful, I decided to reinstall Windows XP and install a lighter version of Linux (I deemed Ubuntu 9.04 too heavy for my old war horse). Now there was a problem – my war horse didn’t have a working optical disk drive. The CD-ROM drive was lost in battles long ago and the CD-RW drive that was put in there as a replacement has refused to show any sort of co-operation over the past few years. The damn thing just keeps on blinking whenever I give it a CD to read. So there was only one option left – the USB install.

Installing from a pen drive is a piece of cake as far as Linux is concerned, but with Windows XP, it was expected to get a lot more complicated and buggy. Fortunately, Novicorp’s WinToFlash came to the rescue and I was able to create a bootable Windows XP installation disk in no time. I erased my entire hard drive using gparted (using a live Linux boot from a pendrive) and then plugged in my Windows XP bootable pendrive.

Everything went fine initially – the ugly blue screen still gave me the creeps. But I had to wait until the setup had loaded all the files to encounter my very first (and hopefully the last) hurdle – the setup didn’t recognise my hard disk. It showed the USB stick with the Windows setup on it though. I could either convert the USB stick to NTFS and install Windows on it, or I could leave the USB stick as it is and install Windows on it. And yes, I tried installing XP to the USB stick, and got a ‘drive corrupted’ error message. I simply couldn’t get the setup to recognize my hard disk! As usual I went with outstretched arms to my ever benevolent friend Google and spent the next few hours reading forum posts and tutorials. And in the end, after 2 hours of toil, I found the solution – press the ESC key when the setup shows the option to convert the USB stick to NTFS!!! I tried it and there came my hard disk, arrayed in all the partitions that it could muster against a beautiful blue backdrop (suddenly blue didn’t look THAT bad).

And it perplexes me as to why I didn’t think of this earlier. Even though I do have the excuse that people don’t go pressing the ESC button when things go unexpectedly, I can say that I would have pressed the ESC button had I been the computer-savvy 6th grader that I once was. And to make me feel even more sorry, it was written at the bottom of the screen: ‘press ESC to cancel’. Yes yes, who would have thought that pressing ESC will give you a whole new menu instead of going back to the previous screen, right? But given my past record, I can very well vouch that I would have pressed the ESC button just because it was mentioned at the bottom of the screen.

This presents me with another dreadful thought – am I losing it? Am I becoming like them? Have my curiosity and knack of pressing the right keys at the right time finally left me? I’ve seen it happen with older people – the amazing guy who was so smart in the 80’s and who can fix his own car being sluggish with computers. They use computers like anybody else – change the wallpaper, set screen savers, and use Internet Explorer. But they will not by any means install TeraCopy or use CCleaner to do a clean up. Am I becoming like them? I used to always believe that if someone is not good with computers, if someone cannot figure out the solutions to his computer problems, then it’s just because he’s not looking around. I’ve always believed that craving to know “Oh, what does this button do?” will eventually lead you to all the wrong problems and all the right solutions.

And that’s quite an awful lot of thought to revolve around an ESC key. So the next time you’re in a fix, hit the damn ESC key first :p

Into the web

I’ve always thought that developing for the web was easy and boring – thanks to my computer teacher who taught me HTML in 8th grade. However, I used to feel a sense of accomplishment when I saw my marquee going across my web page – and changing direction when it hit the edge. And soon enough I came to understand that real programming was a world apart from creating simple web pages in HTML. And I began to look down on web development as something that is less ‘glorious’ than developing desktop applications.

6 years later, I’m the only programmer in the house who can’t put together a (decent) web site.

I tried learning the Django web framework, all the while hoping that Django is that magical thing that will allow me to write games for the web. It wasn’t, or at least I didn’t have enough enthusiasm to get to the part where Django might turn a little interesting. And I even tried to put together some web pages with some JavaScript and they looked terrible – and my accounts at various free web hosts started expiring because I never logged in. And I really wanted to build one of those cool websites with all those mouse-over effects. That’s when I heard about CSS. Then I hardened up and made up my mind that scripting was not what ‘real’ programmers would be doing and that creating web pages and the associated programming was so simple that I shouldn’t be focusing on that – unless I wanted to be a part of the common rabble. This is the single nastiest programming-related mindset I’ve ever had to date.

Trying to amend my path, and right my wrongs, I again attempt to climb the web-development mountain. This time I have a rather amusing friend to take along with me – HTML5. Until recently I thought HTML5 was the good old HTML with some extra tags. But boy, the <canvas> tag changed all that. The <canvas> tag in HTML5 basically lets you draw 2D graphics on to the web page using a scripting language such as JavaScript. And no, I didn’t know any JavaScript. But in a fortunate coincidence, I stumbled upon (nothing to do with the StumbleUpon site though) a JavaScript library for the Processing API, which is built around Java. What? Let me elaborate :

Back when I was still a school boy my dad’s friend introduced me to this wonderful API called Processing. It’s built on Java and it lets you create colorful graphical applications in no time. Compared to the drab console programming (C++) I was learning at school, Processing was a very welcome distraction. I was hooked. It was simple, fast, and so much fun that I soon created the classic pong game in Processing. Then the wretched entrance exams got in between and I let Processing fade away into some teeny tiny corner of my memory.

processing homepage

Here are some examples : http://www.openprocessing.org/

 

And when I found the JavaScript library for Processing, I had hit my jackpot. Because the library, called processing.js, converts all my Processing code into JavaScript. That means I can write my pong game in Processing (which is really Java) and then convert it into JavaScript using the processing.js library, and then display it in the canvas.

http://processingjs.org/

Now this is a big deal because implementing your applet as JavaScript in the canvas does away with the Java plugin and the awful loading time. Also, you can write great programs in no time with the Processing API and better yet, you can show them off on a WEBSITE!!!!! So I guess practically that makes me a web developer as well! Watch out watch out here I come 😉 . Until next time.