Film simulations from scratch using Python

Disclaimer: The post is more about understanding LUTs and HaldCLUTs and writing methods from scratch to apply these LUTs to an image rather than coming up with CLUTs themselves from scratch.

Outline

  1. What are film simulations?
  2. CLUTs primer
  3. Simple hand-crafted CLUTs
  4. The identity CLUT
  5. HaldCLUTs
  6. Applying a HaldCLUT
  7. Notes and further reading

There is also an accompanying notebook, in case you want to play around with the CLUTs.

What are film simulations?

Apparently, back in the day, people shot pictures with analog cameras that used film. If you wanted a different “look” to your pictures, you would load a different film stock that gave you the desired look. This is akin to current-day Instagram filters, though more laborious. Some digital camera makers, like Fujifilm, started out as makers of photographic films (and they still make them), and transitioned into making digital cameras. Modern mirrorless cameras from Fujifilm have film simulation presets that digitally mimic the style of a particular film stock. If you are curious, John Peltier has written a good piece on Fujifilm’s film simulations. I was intrigued by how these simulations were achieved and this is a modest attempt at untangling them.

CLUTs primer

A CLUT, or a Color Look Up Table, is the primary way to define a style or film simulation. For each possible RGB color, a CLUT tells you which color to map it to. For example, a CLUT might specify that all green pixels in an image should be yellow:

# map green to yellow
(0, 255, 0) -> (255, 255, 0)

The actual format in which this information is represented can vary. A CLUT can be a .cube file, a HaldCLUT png, or even a pickled numpy array as long as whatever image editing software you use can read it.

In an 8-bit image, each channel (i.e red, green or blue) can take values from 0 to 255. Our CLUT should theoretically have a mapping for every possible color – that’s 256 x 256 x 256 colors. In practice however, CLUTs are way smaller. For example an 8-bit CLUT would divide each channel into ranges of 32 (i.e 256 divided by 8). Since we have 3 channels (red, green and blue), our CLUT can be imagined as a three dimensional cube:

A standard 3D CLUT. Image Credits

To apply a CLUT to the image, each color in the image is assigned to one of the cells in the CLUT cube, and the color of the pixel in the original image is changed to whatever RGB color is in its assigned cell in the CLUT cube. Hence the color (40, 0, 0) would belong to the second cell along the red axis in the top left corner of the cube. This also means that all the shades of red between (32, 0, 0) and (63, 0, 0) will be mapped to the same RGB color. Though that sounds terrible, an 8-bit CLUT usually produces images that are fine to our eyes. Of course we can increase the “quality” of the resulting image by using a more precise (e.g. 12-bit) CLUT.
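To make the bucketing concrete, here is a minimal sketch, assuming each of the 8 cells per channel covers an equal range of 32 values. The helper name cell_index() is made up for illustration. The functions later in this post round to the nearest cell instead of flooring, so their cell boundaries fall in slightly different places.

```python
def cell_index(value, clut_size=8):
    """Return which of `clut_size` equal-width cells an 8-bit channel value falls into."""
    return value * clut_size // 256

# Number of entries in a full table vs. a CLUT with 8 cells per channel
full_table_entries = 256 ** 3
small_clut_entries = 8 ** 3
print(full_table_entries, small_clut_entries)  # 16777216 512

# 96 and 127 share a cell, 128 starts the next one
print(cell_index(100))                           # 3
print([cell_index(v) for v in (96, 127, 128)])   # [3, 3, 4]
```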

Simple hand-crafted CLUTs

Before we craft CLUTs and start applying them to images, we need a test image. For the sake of simplicity, we conjure up a little red square:

from PIL import Image
img = Image.new('RGB', (60, 60), color='red')
img.show()

We will now create a simple CLUT that would map red pixels to green pixels and apply it to our little red square. We know that our CLUT should be a cube, and each “cell” in the cube should map to a color. If we create a 2-bit CLUT, it will have the shape (2, 2, 2, 3). Remember that our CLUT is a cube with each side of “length” 2, and that each “cell” in the cube should hold an RGB color – hence the 3 in the last dimension.

import numpy as np
clut = np.zeros((2, 2, 2, 3))
transformed_img = apply_3d_clut(clut, img, clut_size=2)
transformed_img.show()

We haven’t yet implemented the “apply_3d_clut()” method. This method will have to look at every pixel in the image and figure out the corresponding mapped pixel from the CLUT. The logic is roughly as follows:

  1. For each pixel in the image:
    1. get the (r, g, b) values for the pixel
    2. Assign the (r, g, b) values to a “cell” in our CLUT
    3. Replace the pixel in the original with the color in the assigned CLUT “cell”

We should be careful with step 2 above – since we have a 2-bit CLUT, we want color values up to 127 to be mapped to the first cell and we want values 128 and above to be mapped to the second cell.
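A quick way to convince yourself of that boundary is to scale a few values by hand. This is only a sketch of the rounding rule that the method relies on, with the scale factor assumed to be (clut_size - 1) / 255:

```python
clut_size = 2
scale = (clut_size - 1) / 255

# Map a few channel values to their cell with plain rounding
cells = {value: round(value * scale) for value in (0, 127, 128, 255)}
for value, cell in cells.items():
    print(value, "->", cell)
```

Python rounds exact halves to the nearest even number, but no 8-bit value lands exactly on 0.5 here, so that quirk does not matter for a 2-bit CLUT.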

from tqdm import tqdm
def apply_3d_clut(clut, img, clut_size=2):
    """
        clut must have the shape (size, size, size, num_channels)
    """
    # PIL reports size as (width, height); the numpy array is indexed as [row, col]
    num_cols, num_rows = img.size
    img = np.asarray(img)
    filtered_img = np.copy(img)
    scale = (clut_size - 1) / 255
    for row in tqdm(range(num_rows)):
        for col in range(num_cols):
            r, g, b = img[row, col]
            # (clut_r, clut_g, clut_b) together represents a "cell" in the CLUT
            # Notice that we rely on round() to map the values to "cells" in the CLUT
            clut_r, clut_g, clut_b = int(round(r * scale)), int(round(g * scale)), int(round(b * scale))
            # copy over the color in the CLUT to the new image
            filtered_img[row, col] = clut[clut_r, clut_g, clut_b]
    filtered_img = Image.fromarray(filtered_img.astype('uint8'), 'RGB')
    
    return filtered_img

Once you implement the above method and apply the CLUT to our image, you will be treated to a very underwhelming little black box:

Our CLUT was all zeros, and unsurprisingly, the red pixels in our little red square were mapped to black when the CLUT was applied. Let us now manipulate the CLUT to map red to green:

clut[1, 0, 0] = np.array([0, 255, 0])
transformed_img = apply_3d_clut(clut, img, clut_size=2)
transformed_img.show()

Fantastic, that worked! Time to apply our CLUT to a real image:

This unassuming Ape truck from Rome filled with garbage is going to be our guinea pig. Our “apply_3d_clut()” method loops over the image pixel by pixel and is extremely slow – we’ll fix that soon enough.

import urllib.request
truck = Image.open(urllib.request.urlopen("https://i.imgur.com/ahpSmLP.jpg"))
green_truck = apply_3d_clut(clut, truck, clut_size=2)
green_truck.show()

That’s a bit too green. We can see that the reds in the original image did get replaced by green pixels, but since we initialized our CLUT to all zeroes, all the other colors in the image were replaced with black pixels. We need a CLUT that would map all the reds to greens while leaving all the other colors alone.

Before we do that, let us vectorize our “apply_3d_clut()” method to make it much faster:

def fast_apply_3d_clut(clut, clut_size, img):
    """
        clut must have the shape (size, size, size, num_channels)
    """
    scale = (clut_size - 1) / 255
    img = np.asarray(img)
    clut_r = np.rint(img[:, :, 0] * scale).astype(int)
    clut_g = np.rint(img[:, :, 1] * scale).astype(int)
    clut_b = np.rint(img[:, :, 2] * scale).astype(int)
    filtered_img = clut[clut_r, clut_g, clut_b]
    filtered_img = Image.fromarray(filtered_img.astype('uint8'), 'RGB')
    return filtered_img
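The speed-up comes from NumPy's integer array indexing: indexing the CLUT with three index arrays of the same shape looks up one cell per pixel in a single operation. A tiny sketch with made-up values, just to show the shapes involved:

```python
import numpy as np

# A tiny 2x2x2 CLUT with two recognisable cells
clut = np.zeros((2, 2, 2, 3))
clut[1, 0, 0] = [0, 255, 0]      # the "red" cell maps to green
clut[0, 0, 1] = [255, 255, 0]    # the "blue" cell maps to yellow

# Per-pixel cell indexes for a 1x2 image: one red pixel, one blue pixel
idx_r = np.array([[1, 0]])
idx_g = np.array([[0, 0]])
idx_b = np.array([[0, 1]])

# One lookup for the whole image; the result has one color per pixel
looked_up = clut[idx_r, idx_g, idx_b]
print(looked_up.shape)   # (1, 2, 3)
print(looked_up[0, 0], looked_up[0, 1])
```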

The identity CLUT

An identity CLUT, when applied, produces an image identical to the source image. In other words, the identity CLUT maps each color in the source image to the same color. The identity CLUT is a perfect base for us to build upon – we can change parts of the identity CLUT to manipulate certain colors while other colors in the image are left unchanged.

def create_identity(size):
    clut = np.zeros((size, size, size, 3))
    scale = 255 / (size - 1)
    for b in range(size):
        for g in range(size):
            for r in range(size):
                clut[r, g, b, 0] = r * scale
                clut[r, g, b, 1] = g * scale
                clut[r, g, b, 2] = b * scale
    return clut 
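The triple loop above is easy to read, but the same table can be built without loops. The following is a sketch of one possible vectorized version using np.meshgrid (the function name create_identity_vectorized is made up here), which should produce the same array as create_identity():

```python
import numpy as np

def create_identity_vectorized(size):
    # The color value stored at each step along one axis of the cube
    axis = np.arange(size) * (255 / (size - 1))
    # indexing="ij" keeps the axis order as (r, g, b)
    r, g, b = np.meshgrid(axis, axis, axis, indexing="ij")
    # Stack the three channels into the last dimension: shape (size, size, size, 3)
    return np.stack([r, g, b], axis=-1)

clut = create_identity_vectorized(12)
print(clut.shape)
print(clut[11, 0, 0])
```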

Let us generate a 2-bit identity CLUT and see how applying it affects our image:

two_bit_identity_clut = create_identity(2)
identity_truck = fast_apply_3d_clut(two_bit_identity_clut, 2, truck)
identity_truck.show()

The two-bit truck

That’s in the same ballpark as the original image, but clearly there’s a lot wrong there. The problem is our 2-bit CLUT – we had a palette of only 8 colors (2 * 2 * 2) to choose from. Let us try again, but this time with a 12-bit CLUT:

twelve_bit_identity_clut = create_identity(12)
identity_truck = fast_apply_3d_clut(twelve_bit_identity_clut, 12, truck)
identity_truck.show()

Left – the original image, right – the image after applying the 12-bit identity CLUT

That’s much better. In fact, I can see no discernible differences between the images. Wunderbar!

Let us try mapping the reds to the greens again. Our goal is to map all pixels that are sufficiently red to green. What’s “sufficiently red”? For our purposes, all pixels that end up being mapped to the reddish corner of the CLUT cube deserve to be green.

green_clut = create_identity(12)
green_clut[5:, :4, :4] = np.array([0, 255, 0])
green_truck = fast_apply_3d_clut(green_clut, 12, truck)
green_truck.show()

That’s comically bad. Of course, we got what we asked for – some reddish parts of the image did get mapped to a bright ugly green. Let us restore our faith in CLUTs by attempting a slightly less drastic and potentially pleasing effect – make all pixels slightly more green:

green_clut = create_identity(12)
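# Clip to the valid 8-bit range so the later uint8 cast does not wrap around
green_clut[:, :, :, 1] = np.clip(green_clut[:, :, :, 1] + 20, 0, 255)
green_truck = fast_apply_3d_clut(green_clut, 12, truck)
green_truck.show()

Left – the original image, Right – the image with all pixels shifted more to green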

Slightly less catastrophic. But we didn’t need CLUTs for this – we could have simply looped through all the pixels and manually added a constant value to the green channel. Theoretically, we can get more pleasing effects by fancier manipulation of the CLUT – instead of adding a constant value, maybe add a higher value to the reds and a lower value to the whites? You can probably see where this is going – coming up with good CLUTs (at least programmatically) is not trivial.

What do we do now? Let’s get us some professionally created CLUTs.

HaldCLUTs

We are going to apply the “Fuji Velvia 50” CLUT that is bundled with RawTherapee to our truck image. These CLUTs are distributed as HaldCLUT png files, and we will spend a few minutes understanding the format before writing a method to apply a HaldCLUT to the truck. But why HaldCLUTs?

  1. HaldCLUTs are high-fidelity. Our 12-bit identity CLUT was good enough to reproduce the image. Each HaldCLUT bundled with RawTherapee is equivalent to a 144-bit 3D CLUT. Yes, that’s effectively a CLUT of shape (144, 144, 144, 3).
  2. However, the real benefit of using HaldCLUTs is the file size. Adobe’s .cube CLUT format is essentially a plain text file with RGB values. Since each character in the text file takes up a byte, a 144-bit CLUT in .cube takes up around 32MB on disk. The equivalent HaldCLUT png image file is around a megabyte. But png images are two-dimensional. How can we encode three-dimensional data using a two-dimensional image? We’ll see.

Let’s look at an identity HaldCLUT:

The identity HaldCLUT, generated using convert hald:12 -depth 8 -colorspace sRGB hald_12.png

Pretty pretty colors. You’d have noticed that the image seems to have been divided into little cells. Let’s zoom in on the cell on the top-left corner:

We notice a few things – the pixel on the top-left is definitely black – so it represents the first “bucket” or “cell” in a 3D CLUT, and pure blacks (i.e. rgb(0, 0, 0)) are going to be mapped to the color present in this bucket. Of course the pixel at index (0, 0, 0) in the above image is black because we are dealing with an identity CLUT here – a different CLUT could have mapped the index (0, 0, 0) to gray. The confusing part here is to figure out how to index into the HaldCLUT – let’s say we have a bright red pixel with the value (200, 0, 0) in our source image. If we were dealing with a normal 144-bit 3D CLUT, we would know that a red value of 200 will belong to the index 200 * 144 / 255 = 113 (approximately), and we would replace the color of this pixel with whatever was at CLUT[113][0][0]. But we are not dealing with a 3D CLUT here – we are dealing with a 2D image, while we have to index into this image as if it was a 3D CLUT.

The entire identity HaldCLUT image in our example has the shape (1728, 1728), and each of those little cells that you see has the shape (12, 144), and there are 144 such cells in a single column of the image (i.e vertically). The HaldCLUT, as you can see, has 12 columns. Hence we have 1728 cells in the entire HaldCLUT, each cell having the shape (12, 144). This is how we index into a HaldCLUT file:

(if the description doesn’t make much sense, it is followed by a code snippet that’s hopefully clearer)

  1. Within each cell, the red index always changes from left to right. In our top-left cell, it changes from 0 to 143. This is the case in each row within each cell – the red index is always 0 in the first column of a cell, and 1 in the second column and so on. Since each cell has 12 rows, in each of these rows the red index changes from 0 to 143.
  2. The green index is constant in each row within a cell, and increments by 1 across cells horizontally, and wraps around to the next row of pixels. Writing positions as (x, y), i.e. (column, row), the pixel at position (143, 0) in the HaldCLUT image represents the index (143, 0, 0), while the pixel at position (144, 0) represents the index (0, 1, 0) and so on. The pixel at position (0, 1), at the start of the second row of pixels, would represent the index (0, 12, 0).
  3. The blue index is constant everywhere within a cell, and increments by 1 across cells vertically. So the pixel at position (0, 11) will represent the index (0, 132, 0) while the pixel at (0, 12) will represent the index (0, 0, 1). Notice how both the red index and the green index were reset to 0 when we moved down the HaldCLUT image by an entire cell.

The top-left corner extracted from the full identity HaldCLUT. Only the first 3 rows and two columns are shown here (the third column is clipped). Note that the annotations represent the index into the 3D CLUT that the pixel would represent if the HaldCLUT was instead a normal 3D CLUT. Each cell has the shape (12, 144). When there are two lines in the diagram seemingly coming out from the same pixel, I am trying to show how the represented index changes between adjacent pixels at a cell boundary.
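The three rules boil down to reading the image in row-major order and splitting the running pixel count into three base-144 digits. Here is a small sketch of that arithmetic. The function name is made up, and positions are written as (x, y), with x being the column and y the row of the pixel:

```python
def hald_position_to_index(x, y, level=12):
    cube_size = level ** 2          # 144 steps per channel
    image_width = level ** 3        # 1728 pixels per row of the image
    flat = y * image_width + x      # position in row-major reading order
    red = flat % cube_size                    # changes fastest
    green = (flat // cube_size) % cube_size   # changes every 144 pixels
    blue = flat // (cube_size ** 2)           # changes every 144 * 144 pixels
    return red, green, blue

for x, y in [(143, 0), (144, 0), (0, 1), (0, 12)]:
    print((x, y), "->", hald_position_to_index(x, y))
# (143, 0) -> (143, 0, 0)
# (144, 0) -> (0, 1, 0)
# (0, 1) -> (0, 12, 0)
# (0, 12) -> (0, 0, 1)
```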

Inspecting the identity HaldCLUT in python reveals the same info:

import math
identity = Image.open("identity.png")
identity = np.asarray(identity)
print("identity HaldCLUT has size: {}".format(identity.shape))
size = round(math.pow(identity.shape[0], 1/3))
print("The CLUT size is {}".format(size))
# The CLUT size is 12
print("clut[0,0] is {}".format(identity[0, 0]))
# clut[0,0] is [0 0 0]
print("clut[0, 100] is {}".format(identity[0, 100]))
# clut[0, 100] is [179   0   0]
print("clut[0, 143] is {}".format(identity[0, 143]))
# We've reached the end of the first row in the first cell
# clut[0, 143] is [255   0   0]
print("clut[0, 144] is {}".format(identity[0, 144]))
# The red channel resets, the green channel increments by 1
# clut[0, 144] is [0 1 0]
print("clut[0, 248] is {}".format(identity[0, 248]))
# clut[0, 248] is [186   1   0]
# Notice how the value in the green channel did not increase. This is normal - we have 256 possible values and only 144 "slots" to keep them. The identity CLUT occasionally skips a 
print("clut[0, 432] is {}".format(identity[0, 432]))
# clut[0, 432] is [0 5 0]
# ^ The red got reset, the CLUT skipped more values in the green channel and now maps to 5. This is the peculiarity of this CLUT. A different HaldCLUT (not the identity one) might have had a different value for this green channel step.
print("clut[0, 1727] is {}".format(identity[0, 1727]))
# clut[0, 1727] is [255  19   0]
# This is the last pixel in the first row of the entire image
print("clut[1, 0] is {}".format(identity[1, 0]))
# clut[1, 0] is [ 0 21  0]
# Notice how the value in the green channel "wrapped around" from the previous row
print("clut[1, 144] is {}".format(identity[1, 144]))
# Exercise for the reader: see if you can guess the output correctly 🙂
print("clut[12, 0] is {}".format(identity[12, 0]))
print("clut[12, 143] is {}".format(identity[12, 143]))
print("clut[12, 144] is {}".format(identity[12, 144]))
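If you do not have ImageMagick at hand, an identity HaldCLUT can also be approximated directly in NumPy. This is a sketch under the assumption that each channel value is simply the index scaled to 0–255 and rounded, so individual values may differ by one from the ones ImageMagick writes. The function name make_identity_hald is made up:

```python
import numpy as np

def make_identity_hald(level):
    cube_size = level ** 2           # steps per channel (144 for level 12)
    width = level ** 3               # the image is width x width pixels
    flat = np.arange(width * width)  # running pixel count in row-major order
    red = flat % cube_size
    green = (flat // cube_size) % cube_size
    blue = flat // cube_size ** 2
    index = np.stack([red, green, blue], axis=-1)
    # Scale the indexes to 8-bit color values
    values = np.rint(index * 255 / (cube_size - 1)).astype(np.uint8)
    return values.reshape(width, width, 3)

hald = make_identity_hald(12)
print(hald.shape)
print(hald[0, 143], hald[0, 144], hald[12, 0])
```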

Applying a HaldCLUT

Now that we’ve understood how a 3D CLUT is sorta encoded in a HaldCLUT png, let’s go ahead and write a method to apply a HaldCLUT to an image:

import math 
def apply_hald_clut(hald_img, img):
    hald_w, hald_h = hald_img.size
    clut_size = int(round(math.pow(hald_w, 1/3)))
    # We square the clut_size because a 12-bit HaldCLUT has the same amount of information as a 144-bit 3D CLUT
    scale = (clut_size * clut_size - 1) / 255
    # Convert the PIL image to numpy array
    img = np.asarray(img)
    # We are reshaping to (144 * 144 * 144, 3) - it helps with indexing
    hald_img = np.asarray(hald_img).reshape(clut_size ** 6, 3)
    # Figure out the 3D CLUT indexes corresponding to the pixels in our image
    clut_r = np.rint(img[:, :, 0] * scale).astype(int)
    clut_g = np.rint(img[:, :, 1] * scale).astype(int)
    clut_b = np.rint(img[:, :, 2] * scale).astype(int)
    filtered_image = np.zeros((img.shape))
    # Convert the 3D CLUT indexes into indexes for our HaldCLUT numpy array and copy over the colors to the new image
    filtered_image[:, :] = hald_img[clut_r + clut_size ** 2 * clut_g + clut_size ** 4 * clut_b]
    filtered_image = Image.fromarray(filtered_image.astype('uint8'), 'RGB')
    return filtered_image
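Another way to look at the index arithmetic in apply_hald_clut() is that the HaldCLUT image is just a flattened cube. The sketch below (the helper name hald_to_cube is made up, and a 3-channel array is assumed) reshapes the pixels back into a regular 3D CLUT indexed as [r, g, b], which could then be used with a method like fast_apply_3d_clut():

```python
import numpy as np

def hald_to_cube(hald_array):
    # The image is level**3 pixels wide, the cube has level**2 steps per side
    cube_size = round(hald_array.shape[0] ** (2 / 3))
    # In row-major order red changes fastest, so the reshaped axes are (b, g, r)
    by_bgr = hald_array.reshape(cube_size, cube_size, cube_size, 3)
    # Swap the first and third axes to get the usual (r, g, b) order
    return by_bgr.transpose(2, 1, 0, 3)

# A fake level-2 "HaldCLUT" (8x8 pixels) filled with distinct numbers
fake_hald = np.arange(8 * 8 * 3).reshape(8, 8, 3)
cube = hald_to_cube(fake_hald)
flat = fake_hald.reshape(-1, 3)

r, g, b = 3, 2, 1
print(cube.shape)
print(np.array_equal(cube[r, g, b], flat[r + 4 * g + 16 * b]))
```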

Let’s test our method by applying the identity HaldCLUT to our truck – we should get a visually unchanged image back:

identity_hald_clut = Image.open(urllib.request.urlopen("https://i.imgur.com/qg6Is0w.png"))
identity_truck = apply_hald_clut(identity_hald_clut, truck)
identity_truck.show()

Let us finally apply the “Fuji Velvia 50” CLUT to our truck:

velvia_hald_clut = Image.open(urllib.request.urlopen("https://i.imgur.com/31UrdAg.png"))
velvia_truck = apply_hald_clut(velvia_hald_clut, truck)
velvia_truck.show()

Left – the original image, Right – image after applying the “Fuji Velvia 50” HaldCLUT

That worked! You can download more HaldCLUTs from the RawTherapee page. The monochrome (i.e. black and white) HaldCLUTs won’t work straight away because our apply_hald_clut() method expects a hald image with 3 channels (i.e. red, green and blue), while the monochrome HaldCLUT images have only 1 channel (the grey value). It won’t be difficult at all to change our method to support monochrome HaldCLUTs – I leave that as an exercise to the reader 😉
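If you want a head start on that exercise, one possible approach (a sketch, not the only way) is to let PIL expand the single grey channel into three identical channels before the HaldCLUT is applied. The helper name ensure_rgb is made up, and the small grey image below merely stands in for a real monochrome HaldCLUT:

```python
from PIL import Image

def ensure_rgb(hald_img):
    """Return the HaldCLUT as a 3-channel RGB image, converting if necessary."""
    if hald_img.mode != "RGB":
        hald_img = hald_img.convert("RGB")
    return hald_img

# Stand-in for a single-channel (mode "L") HaldCLUT image
mono = Image.new("L", (8, 8), color=128)
rgb = ensure_rgb(mono)
print(rgb.mode, rgb.getpixel((0, 0)))   # RGB (128, 128, 128)
```

The converted image can then be passed to apply_hald_clut() unchanged.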

Notes and further reading

Remember how we saw that a 2-bit identity CLUT gave us poor results while a 12-bit one almost reproduced our image? A small CLUT does not necessarily have to give poor results. Image editing software can interpolate between the missing values. For example, this is how PIL applies a 3D CLUT with linear interpolation.
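To get a feel for what interpolation buys us, here is a rough from-scratch sketch of trilinear interpolation in NumPy. It is not PIL's implementation, and the function name is made up. Instead of snapping each pixel to the nearest cell, it blends the 8 cells surrounding the pixel's position in the cube:

```python
import numpy as np

def apply_3d_clut_trilinear(clut, pixels):
    """
        clut has the shape (size, size, size, 3), pixels is a (..., 3) uint8 array
    """
    size = clut.shape[0]
    # Fractional position of every pixel inside the CLUT grid
    pos = pixels.astype(np.float64) * (size - 1) / 255
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, size - 1)
    frac = pos - lo
    out = np.zeros(pixels.shape, dtype=np.float64)
    # Blend the 8 corners of the surrounding cell, weighted by proximity
    for dr in (0, 1):
        for dg in (0, 1):
            for db in (0, 1):
                r = (hi if dr else lo)[..., 0]
                g = (hi if dg else lo)[..., 1]
                b = (hi if db else lo)[..., 2]
                weight = ((frac if dr else 1 - frac)[..., 0]
                          * (frac if dg else 1 - frac)[..., 1]
                          * (frac if db else 1 - frac)[..., 2])
                out += weight[..., np.newaxis] * clut[r, g, b]
    return np.rint(out).astype(np.uint8)

# A 2-bit identity CLUT: corner (r, g, b) stores (255 * r, 255 * g, 255 * b)
corners = np.array([0.0, 255.0])
identity = np.stack(np.meshgrid(corners, corners, corners, indexing="ij"), axis=-1)

pixels = np.array([[[200, 100, 50], [12, 0, 255]]], dtype=np.uint8)
result = apply_3d_clut_trilinear(identity, pixels)
print(result)
```

With interpolation, even the 2-bit identity CLUT hands back the colors it was given, because an identity mapping is exactly linear between the corners of the cube.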

The “Fuji Velvia 50” HaldCLUT that we use is an approximation of Fujifilm’s proprietary Velvia film simulation (probably) by Pat Davis.

If you want to create your own HaldCLUT, the easiest way would be to open up the identity HaldCLUT png file in an image editing software (e.g. RawTherapee, Darktable, Adobe Lightroom) and apply global edits to it. For example, if you change the saturation and contrast values of the HaldCLUT png using the image editor, and apply this modified HaldCLUT png (using our python script, or a different image editor – doesn’t matter how) to a different image, the resulting image would have more contrast and saturation. Neat, right?

Programming: doing it more vs doing it better


A few years ago, very early into my programming career, I came across a story:

The ceramics teacher announced on opening day that he was dividing the class into two groups. All those on the left side of the studio, he said, would be graded solely on the quantity of work they produced, all those on the right solely on its quality. His procedure was simple: on the final day of class he would bring in his bathroom scales and weigh the work of the “quantity” group: fifty pound of pots rated an “A”, forty pounds a “B”, and so on. Those being graded on “quality”, however, needed to produce only one pot – albeit a perfect one – to get an “A”.

Well, came grading time and a curious fact emerged: the works of highest quality were all produced by the group being graded for quantity. It seems that while the “quantity” group was busily churning out piles of work – and learning from their mistakes – the “quality” group had sat theorizing about perfection, and in the end had little more to show for their efforts than grandiose theories and a pile of dead clay.

Jeff Atwood’s “Quantity Always Trumps Quality” post, though he himself took the story from somewhere else.

This little story has had a tremendous impact on how I approach software engineering as a craft. I was (and still am) convinced that the best way to get better at software engineering is to write more software. I was careful enough to not take the story too seriously – I have always strived to write readable, maintainable code without bugs. However, deep inside my mind was this idea that one day I would be able to write beautiful code without thinking. It would be as effortless to me as breathing. “Refactoring code” would be something left to the apprentice, not something that I, the master who has churned out enough ceramic pots, would be bothered with. I just have to keep making ceramic pots until I get there.

Three years later, I am still very much the apprentice. Rather than programming effortlessly, I have learned to program more deliberately. I have learned (the hard way) to review my code more thoroughly and to refactor it now rather than later. I get pangs of guilt and disappointment every time my pull request has to go through another round of review. I am frustrated when I deliver a feature two days late. As an engineer I want to, above everything else, churn out (the right) features as fast as possible.

Today, I came across an essay that would let me resign from my perpetual struggle to “get faster” at engineering:

I used to have students who bragged to me about how fast they wrote their papers. I would tell them that the great German novelist Thomas Mann said that a writer is someone for whom writing is more difficult than it is for other people. The best writers write much more slowly than everyone else, and the better they are, the slower they write. James Joyce wrote Ulysses, the greatest novel of the 20th century, at the rate of about a hundred words a day

William Deresiewicz, Solitude and Leadership

I can strongly relate to this – I would often read and re-read something that I wrote and then I would go back and change it, only to repeat the process again. Though comparing my modest penmanship (keymanship?!) to “the best writers” is outright sacrilegious, even I have in the past noticed that the slower I write, the better I write.

The equivalent in software engineering terms would be to (nothing you did not know before, except for maybe the last point):

  1. Put more thought into the design of your systems
  2. Refactor liberally and lavishly
  3. Test thoroughly
  4. Take your sweet time

As I said, nothing you did not know before. Also, this is almost impossible to pull off when you have realistic business objectives to meet.

But James Joyce probably did not write Ulysses with a publisher breathing down his neck saying “We need to ship this before Christmas!”.

So the secret sauce that makes good code great and the average Joe the next 10x programmer might be this – diligence exercised over a long time.

How does this affect me? Disillusionment. Writing more software does not automatically make you a better programmer. You need the secret sauce, whatever that might be.

Announcing matchertools 0.1.0

Matchertools is my “hello world” project in rust, and I have been chipping away at it slowly and erratically for the past couple of months. You can now find my humble crate here. The crate exposes an API that implements the Gale-Shapley algorithm for the stable marriage problem. Read the wiki. No really, read the linked Wikipedia page. Lloyd Shapley and Alvin Roth won a Nobel prize for this in 2012. Spoiler alert – unlike what the name indicates, the algorithm has little to do with marriages.

This project is so nascent that it is easier for me to list what it does not have:

  1. No documentation
  2. No examples
  3. Shaky integration tests
  4. No code style whatsoever. I haven’t subjected the repo to rustfmt yet (gasp!)
  5. Duct-tape code.
  6. Not nearly enough code comments.

Meta

I had recently adopted a new “philosophy” in life:

Discipline will take you farther than motivation alone ever will

Definitely not me, and more a catch-phrase than philosophy

Most of my side projects do not make it even this far. I go “all-in” for the first couple of days and then my enthusiasm runs out and the project is abandoned before it reaches any meaningful milestone.

But I consciously rate limited myself this time. I had only one aim – work on matchertools every day. I did not really care about the amount of time I spent on the project every day, as long as I made some progress. This also meant that some days I would just read a chapter from the wonderful rust book and that would be it. However, I could not stick to even this plan despite the rather lax constraints – life got in the way. So my aim soon degenerated into “work on matchertools semi-regularly, whenever I can, but be (semi) regular about it”. Thus in two months, I taught myself enough rust to shabbily implement a well-known algorithm. Sarcasm very much intended.

Though I was (am) horrified at the painfully slow pace of the project, the “be slow and semi-regular but keep at it” approach did bear results:

  1. I learned a little rust. I am in love with the language. The documentation is superb!
  2. I produced something, which is far better than my side-project-output in the past 18 months – nothing.

Besides, I have realized that much of what happens around me is unplanned and unpredictable to a larger degree than I had thought. I am currently working on revamping the way I plan things and the way I react when my plans inevitably fail. A little Nassim Nicholas Taleb seems to help, but more on that later.

Web design for programmers: A 10-minute crash course

I’m not a designer, and I’d rather not be one. However, there are times when programmers who don’t like to design (or draw, for that matter) are forced into that tedious act. I was responsible for designing the front end of a product at a company I interned at for the last 2 months.

Needless to say, html + css was terrifying for me. There were days where I spent entire mornings trying to align the bloody divs. Also, my choice of colors and “ui elements” were not at all pleasing. I had to pull this together somehow. I scoured the web for some intro to design. So here’s what 2 months of front-end taught me :

1. For the love of God, use bootstrap. No matter how promising the control and flexibility of pure css looks, use bootstrap and save the headache – at least when you start out.

2. Use a pen and paper to sketch your design. If you don’t like pens or papers, use a wireframing tool such as wireframe.cc. I spent some considerable time building wireframes, and then threw them away when I changed the design. Lesson learned – use pen and paper. Wireframes are useful when you want a more detailed/accurate layout of your web app.

3. Chances are that you are terrible at choosing colors. Use a tool like paletton to find the right colors, and the right combination of colors.

4. Use good fonts. Microsoft’s Segoe UI is now my favourite font. Segoe UI wasn’t featured in even a single article that discussed the “best free web fonts”. Experiment.

5. Don’t use too many colors, and don’t use too many fonts. Try to keep it simple whenever possible.

6. The official bootstrap docs do not contain references to some really useful bootstrap components like “panel” and “panel-default”. So be sure to double check before you decide that bootstrap doesn’t have it already.

7. You can’t come up with a “mind blowing, innovative, revolutionary design” over night. You might, but chances are that you won’t. Always try to build upon designs (please don’t use templates) that already exist. Here are some useful links for you to ‘build-upon’ :

8. Don’t be afraid to rewrite the HTML. I had to design a signup form and my first implementation sucked. The HTML was a mess and I couldn’t even think of modifying it. So I just wrote that page again, from scratch. Not only did I come up with a wonderful new design and styling (hint: tiles and css shadow on hover), the HTML was much much more readable. Break and build, break and build.

Good luck.

Cohen’s clipping algorithms

Okay, this was homework. I searched for a really long time for a javascript implementation of Cohen’s clipping algorithms and could find none. The professor said to write it in C, but it’s hard to program mouse clicks in C. With javascript, all it takes is a browser.

1. Cohen-sutherland line clipping algorithm in javascript

2. Sutherland-Hodgman polygon clipping algorithm in javascript.

Sutherland-Hodgman polygon clipping in action

I believe the code is pretty readable – I have commented it lavishly. Save them as html files, open them in a browser, and keep clicking the left mouse button.

And yes, the implementation is not perfect. I basically drew over the edges in white to “erase” it and that is why you see a very thin line outside the rectangle in the image.

Sound frequencies with aubio

Small python script I wrote so that you can yell at the console and see the frequency on the screen. The results can be slightly wrong (incorrect spikes in frequency occasionally) but it was great yelling at the computer with my hostel mates to see who’s got the highest ‘range’ 😀

Link to the github gist.

The code is too small to give an explanation. However, you need to set up a few libraries before running the gist (instructions for linux) :

1. aubio – A fantastic library for analysing audio. Packages libaubio and python-aubio are available in the ubuntu/mint repositories. However, I ran into problems (repos have older versions I guess) and was able to fix them only after compiling the source. So head over to this repo, download the source code, and compile.

To compile aubio, head over to the source directory and type:

./waf configure

That will spew out a list of packages you will need at the end. Make sure you install the dev versions of each package. For example, for sndfile, do

sudo apt-get install libsndfile1-dev

 

Similarly install all the packages that you would need to use with aubio. I did not have a clue as to what I will need so I installed them all.

Now do ./waf build
and then sudo ./waf install

That should install aubio on your linux system. Time to install the python wrappers. ‘cd’ to /python directory in the aubio source.

python setup.py build to build the files and after building,
sudo python setup.py install to install the python wrappers for aubio

 

2. The snippet depends on pysoundcard, which is not available in the repos. Head over here to download the source. Build and install this python package the same way you did the aubio python wrappers.

Download (or type) the gist and run it! Happy yelling!

GSoC : Final report

Putting together a quick report of how I spent my last 3 months on improving varnam, an awesome transliteration project. My task was to implement a stemmer to improve the learning in varnam.
A stemmer is an algorithm that, upon giving a word as the input, gives the base word as the output.

For example, giving മരത്തിലൂടെ as the input would give you മരത്തിൽ and മരം as outputs. മരം is the final output of the stemmer and മരത്തിൽ is an intermediate output of the stemmer. The algorithm is described here. The stemmer is similar to the SILPA stemmer created by Santhosh Thottingal except that my version makes use of an exceptions table and produces meaningful intermediate words.

A screencast that explains my work is posted above. Make sure you watch it in 720p to clearly see the words being typed.

As far as statistics go, see this thread to know how much the learning has improved. This is not the final result, as the number of words learned is of no consequence if the stemmer does not improve transliteration accuracy. Thorough transliteration accuracy tests, with and without the stemmer, are yet to be done. Judging by the number of new words in the word corpus alone, varnam saw an improvement of 63% in learning when tested with 408 words. See the above thread for the exact results and the word corpus used.

GSoC : Memory heap corruption and code rewrite

This week I’ve been busy rewriting the stemmer and debugging some memory heap corruption. My first implementation of the stemmer used to crash ibus whenever certain words, like “ദൂരെയാണ്” and “വിദൂരമായ”, were typed. I could not locate the problem, and the only error message I got when ibus crashed was “free() – invalid next size”. Some searching revealed that it might be due to memory heap corruption. I used valgrind memcheck to debug it. It was difficult to make sense of valgrind’s output, and that eventually led me to ask a question at stackoverflow. However, before all this, I was convinced that I had made some serious mistake somewhere along the development path and decided to sit down and rewrite the whole project. I thought that I had made a mistake by not testing with ibus early on. I discovered what I was doing wrong to cause the memory corruption soon after (even before the guy came in and gave his answer at stackoverflow.com). However, I realised that a rewrite would do the project much good. To start with, I could run valgrind as I went along with the rewrite to make sure that I plugged all the possible memory leaks. I was also able to look into some unnecessary function calls, among other things. In short, I cleaned up the code and it is ready for a code review.

Here’s a changelog:

1. Tried implementing the “improvement scheme”, as I had suggested in this thread. The results were far worse than expected. 60% of the words after suffix appending were not meaningful. Any further attempts along this path would require much more careful planning and research into the Malayalam language.

2. Located and avoided [did not stonewall it] an annoying memory corruption. Filed it under issue 51.

3. Removed the level hierarchy. All stemrules are now grouped into one. Splitting the stemrules into 3 levels serves no real purpose, and complicates stemming by needing to check each level separately. Also, removal of the level system has improved the code readability a lot.

4. Replaced some function calls with inline expansions. Made all the functions more defensive and freed memory wherever valgrind reported memory leaks.

5. libvarnam-ibus requires a clean build every time libvarnam.so changes. It seems that libvarnam-ibus has its own copy of libvarnam or something. Should look into this. Ibus not reflecting the changes I made to libvarnam was a real headache, and no amount of debugging could solve the issue. Tried recompiling libvarnam-ibus and things started to work.

6. Eliminated recursive calls to varnam_learn(). In the first implementation, varnam_learn() would call varnam_stem() which calls varnam_learn_internal(). This was bad design. Now varnam_stem() returns a varray to varnam_learn(), and varnam_learn() iterates over this varray to learn all the stemmed words.
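The new control flow from item 6 looks roughly like this when sketched in Python. The names here are stand-ins: learn plays the role of varnam_learn(), the stem argument plays the role of varnam_stem(), a plain list stands in for the varray, and a set stands in for the learnings database. toy_stem is a made-up rule used only to have something to call.

```python
def learn(word, stem, store):
    """Learn `word` and each of its stems, without any recursion.

    `stem` is any callable that returns a list of stemmed forms;
    `store` is a set standing in for the learnings database.
    """
    store.add(word)
    for stemmed in stem(word):   # iterate over the returned array
        store.add(stemmed)       # plain insert; learn() is never re-entered
    return store


def toy_stem(word):
    """Made-up stemmer: strip a trailing "ing" once, if the word has one."""
    return [word[:-3]] if word.endswith("ing") and len(word) > 5 else []


learned = learn("walking", toy_stem, set())
print(sorted(learned))
```

Since the stemmer only returns data and never calls back into the learner, the two functions can be tested separately.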

These changes are not final. Some of them, like doing away with the level system, were made without consulting my mentor and will be reverted if he thinks they were a bad decision. You can see all my changes here and make suggestions.

To do :

1. More tests
2. Make sure stemmer works well with other languages
3. Enable varnam to stem from the command line interface

GSoC : Code review 1, almost.

Before more thorough testing of the stemming algorithm and its effect on varnam’s learning, my mentor and I decided that it would be a good idea to do some code review. So this week I fixed some problems with the stemming, tested how the stemming works with ibus input method, checked if learning is improving at all, and wrote some unit tests.

Stemming with ibus works, though with some bugs. Let us consider a case that works. The learnings database is now empty and we are starting with a blank slate. Varnam does not know anything other than the symbols specified in the scheme file.
The video below demonstrates varnam learning a word with ibus as the input method. The next time the user starts to type the same word, you can see that its stemmed forms are available in the suggestions.


Right now the only cause of concern with the suggestions is that incomplete words are suggested first, and the user has to go through the suggestions list to find the intended word. Also each time varnam learns a stemmed word, all its prefixes are learned as well. This will eventually lead to the incomplete prefixes coming up first on the suggestions list and the user will have to look through the list to find the word she is looking for.

There are some bugs, like some words disappearing when I choose them from the suggestions. The varnam_stem() function is possibly modifying some things that it isn’t supposed to. I’m also getting “free() – invalid next size (fast)” errors. Maybe the upcoming code review will expose my mistakes.

GSoC : Exceptions table and some testing

Progress has been slow the past week, thanks to some non-academic preoccupations and a trip home. However, had I been a bit more organized, I would have been more successful at the rather mundane task of testing out the stemming accuracy.
There are some design changes. Some stem rules did not give the desired results in all cases. That is, there were exceptions. One particular stem rule that was giving me considerable headache was “ന്” => “ൻ”. For example, ആദിത്യന് should stem to ആദിത്യൻ. But while this worked wonderfully, പിറ്റേന്ന് would be incorrectly stemmed to പിറ്റേന്ൻ. This is because ന്ന is actually a combination of ന് and ന. So ന്ന് is actually ന്+ന് and my algorithm stems the first ന് to ൻ (see previous post).

This problem can be solved by using a look ahead. A proper, fully scalable look ahead (that is, an algorithm that can look ahead any number of characters) can turn out to be too much, so I decided to test the idea with a single look ahead. Along with the stem rules, I added another table to the database, “stem_exceptions”, that contains exceptions for each stem rule. For example, the exception rule for ന് is “ന്” => “ന്”. This tells varnam NOT to stem ന് to ൻ if the syllable preceding ന് is another ന്. This ensures that varnam will stem ന് to ൻ in all cases except when it occurs as a part of ന്ന്.
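The exception lookup can be sketched in Python like this. The two dictionaries are in-memory stand-ins for the stem rules and the “stem_exceptions” table, and stem_with_exceptions is a made-up name; varnam keeps the real tables in its database.

```python
# Stand-ins for the database tables (illustrative only).
STEM_RULES = {"ന്": "ൻ"}            # suffix -> replacement
STEM_EXCEPTIONS = {"ന്": ["ന്"]}     # suffix -> preceding syllables that block it


def stem_with_exceptions(word):
    """Apply a stem rule unless an exception sits right before the suffix."""
    for suffix, replacement in STEM_RULES.items():
        if not word.endswith(suffix):
            continue
        head = word[: -len(suffix)]
        # Single look at whatever immediately precedes the suffix.
        if any(head.endswith(exc) for exc in STEM_EXCEPTIONS.get(suffix, [])):
            return None              # exception hit: leave the word alone
        return head + replacement
    return None                      # no rule matched


print(stem_with_exceptions("ആദിത്യന്"))     # rule applies
print(stem_with_exceptions("പിറ്റേന്ന്"))    # blocked by the exception
```

Comparing the end of the remaining string against each exception is enough here because only one preceding syllable is ever inspected.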

Lucky for me, the exceptions table proved useful with many other stem rules. A look ahead of a single syllable seems to satisfy varnam’s needs, at least with Malayalam. I had to implement some helper functions: one that returns the last syllable of a word (e.g. in ആദിത്യന്, the last syllable would be “ന്”, and the last unicode character would be “്”) and another that counts the number of syllables in a word. The syllable count is useful for skipping the stemming of very short words. For example, varnam does not apply a stem rule if syllables_in_original_word – syllables_in_suffix is less than 2. The number 2 is arbitrary, but solves some common problems such as മകൾ. As a happy consequence, varnam will now not stem മകൾ at all but will stem പേനകൾ to പേന. Though this is neither a permanent nor a complete solution, it is enough to prevent some common stemming mistakes.
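Here is a rough Python sketch of the two helpers and the length check. It rests on a simplifying assumption that is mine, not varnam's: a syllable is a base letter plus any signs (vowel signs, virama, anusvara) that follow it, detected through the Unicode category. The function names and the “കൾ” suffix are also just for illustration.

```python
import unicodedata


def syllables(word):
    """Split a word into rough syllables: a letter plus the signs after it."""
    parts = []
    for ch in word:
        # Categories Mn/Mc cover vowel signs, virama and anusvara.
        if parts and unicodedata.category(ch).startswith("M"):
            parts[-1] += ch
        else:
            parts.append(ch)
    return parts


def last_syllable(word):
    """Return the final syllable of a word, or "" for an empty word."""
    parts = syllables(word)
    return parts[-1] if parts else ""


def may_stem(word, suffix, minimum=2):
    """Allow stemming only if at least `minimum` syllables would remain."""
    return len(syllables(word)) - len(syllables(suffix)) >= minimum


print(syllables("ആദിത്യന്"))
print(may_stem("മകൾ", "കൾ"), may_stem("പേനകൾ", "കൾ"))
```

With this splitting, മകൾ has three units and പേനകൾ has four, so removing the two-unit suffix “കൾ” is refused for the first and allowed for the second.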

I’ve been able to test the accuracy of the algorithm on some Malayalam Wikipedia articles. I made 3 sets of about 1000 words each. The contents of each set belong to a particular category. My rather small test data is hosted at this repository. Here are the results for each set:

history_wikipedia – 94%
Technical_wikipedia – 89.7%
Art_wikipedia – 92.6%

Give or take 2% from each set, though I’ve been quite liberal in flagging results as errors. The fact to be noted is that if a word that should not be stemmed is not stemmed, it counts as a correct result. I do not know if this is how other stemming algorithms are tested. If 3000 definitely stemmable words were given as input, there is a considerable chance that the accuracy would be lower.
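To make the scoring convention concrete, here is a small Python sketch of how such an accuracy figure can be computed. The function name and the sample cases are invented for illustration; they are not my test data.

```python
def stemming_accuracy(cases):
    """Fraction of cases where the stemmer output matches the expected output.

    Each case is (expected, actual). None means "should not be stemmed" or
    "was not stemmed", so a word correctly left alone counts as correct.
    """
    if not cases:
        return 0.0
    correct = sum(1 for expected, actual in cases if expected == actual)
    return correct / len(cases)


# Made-up sample: one good stem, two words correctly left alone, one bad stem.
sample = [("a", "a"), (None, None), (None, None), ("b", "c")]

overall = stemming_accuracy(sample)
stemmable_only = stemming_accuracy([c for c in sample if c[0] is not None])
print(overall, stemmable_only)
```

Scoring only the words that should be stemmed gives a lower number for this sample, which is the effect I suspect a set of definitely stemmable words would have.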

I would have loved to test the algorithm on a more recent corpus such as the Mathrubhumi newspaper archives. But there was some issue with the font, especially the chillus, that represented the Malayalam letters quite differently on the konsole than how they were rendered in the browser. For example, words ending with ൽ in the browser were seen to end with ല് when I copied them to the konsole. Hence, the stem rules did not match many suffixes and produced a lot of incorrect stemming or no stemming at all.
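I have not confirmed the cause, but one plausible explanation is that the archive text uses the older chillu encoding (consonant + virama + zero width joiner) while the stem rules use the atomic chillu characters added in Unicode 5.1. If that is the case, normalising the text before stemming might help. A sketch in Python, with normalize_chillus being a made-up helper name:

```python
ZWJ = "\u200d"      # zero width joiner
VIRAMA = "\u0d4d"   # Malayalam sign virama

# Legacy sequence (consonant + virama + ZWJ) -> atomic chillu
LEGACY_CHILLUS = {
    "\u0d23" + VIRAMA + ZWJ: "\u0d7a",   # chillu nn (ൺ)
    "\u0d28" + VIRAMA + ZWJ: "\u0d7b",   # chillu n  (ൻ)
    "\u0d30" + VIRAMA + ZWJ: "\u0d7c",   # chillu rr (ർ)
    "\u0d32" + VIRAMA + ZWJ: "\u0d7d",   # chillu l  (ൽ)
    "\u0d33" + VIRAMA + ZWJ: "\u0d7e",   # chillu ll (ൾ)
}


def normalize_chillus(text):
    """Replace legacy chillu sequences with their atomic code points."""
    for legacy, atomic in LEGACY_CHILLUS.items():
        text = text.replace(legacy, atomic)
    return text


legacy_word = "\u0d2e\u0d30\u0d24\u0d4d\u0d24\u0d3f\u0d32" + VIRAMA + ZWJ
print(normalize_chillus(legacy_word) == "\u0d2e\u0d30\u0d24\u0d4d\u0d24\u0d3f\u0d7d")
```

A terminal that ignores the joiner would show the legacy form as ല്, which matches what I saw on the konsole.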

One thing I’m happy to observe is that, given a word, the stemmer is producing multiple words that varnam can learn in different stages. For example, കാലങ്ങളുടെ would first stem to കാലങ്ങൾ (which varnam learns) and then to കാലം (learns again). If all goes well, I will be able to test and tweak the algorithm extensively this week and hopefully start estimating how much the suggestions are improved. Then, I hope, it will be time for some code reviewing with my mentor.