Posts

Showing posts with the label erlang

I Second That Emotion

So Tim Bray finds out that Erlang IO is slow . I can attest to this fact, as my recent work on reading large files in Erlang has shown that IO and string manipulation is much slower than I would have wanted. Yes, like Bray, my file reading is single threaded (although, what I do with the line is very multi-threaded) so I suppose using a single thread for Erlang isn't very Erlang-like in the first place. In the meantime, I'm porting my OLAP cube generator to Scala. The assumption (and shortly, hopefully proof) is that the JVM can do file IO much better than Erlang, yet I can still take advantage of Scala's Actors to retain my concurrency. Update : OK, some numbers and code. This is a benchmark for Erlang and Scala to read in a file line by line. First, the Erlang code: process_file2(Filename) -> {ok, File} = file:open(Filename, read), process_lines2(File). process_lines2(File) -> case io:get_line(File, '') of eof -> file:close(File); _ -...

No SMP Erlang for Windows?

I was trying to enable SMP on my Windows Erlang install, but I was shocked to find out that SMP is not supported in the official Windows Erlang distribution. I confirmed this by reading the Errata for Programming Erlang : #8893: "SMP Erlang has been enabled ... since R11B-0 on all platforms where SMP Erlang is known to work." PLEASE mention that Windows is NOT among them. What the heck? That seems like a horrible little secret of the Erlang world. I don't want to compile my own Erlang on windows just to get SMP (which, imo, is the killer feature of erlang).

Tail Recursive Line Processing in Erlang

When I was testing performance in Erlang using funs , I mentioned I wanted to see what would happen if I made the functions tail recursive . I just took a crack at it, and it looks like making this particular function tail recursive didn't help performance. (as always, please let me know if my assumptions are incorrect.) Tail Recursive Line Processing process_file(Filename) -> {ok, File} = file:open(Filename, read), process_lines(File, first, 0). process_line(eof) -> done; process_line(Line) -> L = string:tokens(string:strip(Line, both, $\n), " "), {Dimensions, Measures} = lists:split(10, L), lists:map(fun(X) -> {I,_} = string:to_integer(X), I end, Measures). process_lines(_, eof, _) -> done; process_lines(File, PreviousLine, LineNum) -> Line = io:get_line(File, ''), process_line(Line), process_lines(File, Line, LineNum + 1). Running this code agains a file of 10037355 lines and 1028071833 bytes big, it takes on average 401.40 seconds. ...

Erlang Fun Results

As I code more and more in Erlang , I'm very interested how to achieve the best performance possible. I'm very new to the language, so I'm still unsure what the best practices are concerning performance. Tonight I was testing the effects of using a fun() in an algorithm. My test concerns reading a tab delimited text file, tokenizing it, and converting any numbers into integers. I've split the program into two conceptual parts: 1) the file IO and line reading, and 2) the handling of the line. I wanted to test the performance differences between using a fun() for the line handling vs just including the line handling code directly. My test text file is 10037355 lines long and 1028071833 bytes big. I compiled my code using HIPE . The quick answer is, using a fun() is slightly slower than not using it (which is to be expected). For my particular test, using a fun() was approximately 20% slower. test Run 1 (sec) Run 2 (sec) fun() 484.631 485.380 without fun() 404.017 4...

Enabling SMP Support for Erlang on Mac OS X

If you are working with Erlang on Mac OS X and have installed it via Mac Ports , then you might not be running the SMP enabled erlang. To check, start erlang as erlang -smp If you get: Argument '-smp' not supported. Then your erlang was not compiled with SMP. All you'll need to do is: sudo port uninstall erlang sudo port install erlang +smp Then when you run erlang -smp , you'll be dropped right into the erlang shell. When you run anything with multiple processes, you'll see both of your CPU cores active.

OLAP Cube Construction with Erlang

I've managed to build the parallel OLAP cube constructor in Erlang . This program achieves parallelization through creating a process for every dimension in the OLAP cube. Each process manages the file that holds the dimension data. Messages are passed from the first dimension all the way down to the last dimension which stores the measures themselves. To further parallelize things, you can partition any dimension using a modulus, which creates another file and process. This helps get around the 2 GB limit for dets tables . I also have basic path based querying working, which is also parallelized through sending the query message through each dimension. While the querying itself isn't parallel for a particular client, it will theoretically scale to handle many clients. When I move to move traditional querying to generate a traditional tabular result set, I will be able to parallelize the query for a single client. I'll post the working code once I can choose a suitabl...

Parallel OLAP Cube Construction

Still tilting at windmills here , with my quest to calculate an OLAP cube still trotting along. Lately I've been using Erlang to see if I can use its concurrent programming abilities to scale the OLAP cube generation. The initial progress is promising, with clear concurrency complete. I hope to post the majority of the code soon, after I run a few more tests. For now, here's how I add two lists of numbers in Erlang. Please let me know if there's a better way. Update : The best way to do this is (thanks to Daniel Larsson): lists:zipwith(fun(X,Y) -> X+Y end, [1,2],[3,4]) Old 'n busted way: add_measures(OldMeasures, NewMeasures) -> add_measures(OldMeasures, NewMeasures, []). add_measures([Old|OldRest], [New|NewRest], Accum) -> add_measures(OldRest, NewRest, [Old+New|Accum]); add_measures([], [], Accum) -> lists:reverse(Accum).

Calculating Combinations In Ruby From Erlang

Well, thanks to the many people ( here and here ) that provided their versions of an erlang way to calculate combinations , I've really begun to open my mind to how to think functionally. To help me understand what is going on, I've converted the basic idea into a Ruby version of calculation combinations. This uses recursion like the erlang versions do. class Array def head_tail [self.first, self.tail] end def tail self[1,self.size-1] end end def combos(list) return [[]] if list.empty? h, t = list.head_tail t_combos = combos(t) t_combos.inject([]) {|memo, obj| memo << [h] + obj} + t_combos end c = combos([1,2,3,4]) require 'pp' pp c As you can see, I added a bit of erlangism to the Array class, by adding a method to get the head and tail of an array. Let's run through this. On the first call to combos([1,2,3,4]) we jump over the first line (the exit in our recursion). We generate the head and tail, which in this case is 1 and [2,3...

Calculating Combinations the Erlang Way

If you recall, I wrote some Ruby code to calculate combinations of values in lists . I needed to create a list of all combinations of values, where each combination had between 0 and N number of values, where N is equal to length of the source list. (I'm not sure I'm explaining that correctly, but refer to my previous post for examples). Here's my first shot at how to do this in erlang . It look longer to find math:pow and how to convert a float to an integer in erlang than to write the actual code. -module(s). -export([combos/1]). combos(L) -> combos(L, bit_masks(length(L))). combos(L, [BH|BT]) -> [mask_list(L, BH)|combos(L, BT)]; combos(_, []) -> []. mask_list([H|T], [BH|BT]) -> case (BH) of 1 -> [H|mask_list(T, BT)] ; 0 -> mask_list(T, BT) end; mask_list([], []) -> []. bit_masks(NumColumns) -> bit_masks(0, round(math:pow(2, NumColumns))-1, NumColumns). bit_masks(Max, Max, NumColumns) -> [padl(NumColumns, bl(Max))]; bit_ma...