Skip Ribbon Commands
Skip to main content

Blog

:

Quick Launch

Home
Blog of Rick Taylor, containing my thoughts, comments and questions.
June 12
Parallel Processing – Some Notes

I have been doing some reading and researching into parallel processing recently. Massive parallelism has always been an interest to me, and has in fact been something I’ve played around with for some time now, resulting in such geekery as building a small redhat linux based Beowulf cluster in my apartment some years ago. Currently, all the computers in my house participate in an Alchemi cluster, which is a .Net-centric way of cluster programming/grid computing and well worth looking into. It is in many respects similar to Digipede (in fact, many things are directly portable), but is open source. It’s quite a fun toy and very powerful.

 

Some of the things I have been reading up on are classical ways of looking at massive parallelism, such as Microsoft’s HPC initiative. Some other things are new and pretty exciting, such as the MS Accelerator project, ParallelFX, and the stuff Google is doing with MapReduce. There is a lot of information out there, and this subject is deep as well as wide, so I’ll try to present this as pointers to where you can find more information, with some notes as to what I found to be cool on each bit.

 

Traditional Approaches

 

On the more traditional side of things, there is Windows HPC Server. Windows HPC Server consists of Windows Server running MS Compute Cluster Server, which is additions to Windows Server 2003, and (soon) Windows Server 2008 which allow Windows servers to be used to create traditional MPI based clusters, based on the MPICH2 standard, with some optimizations and bug fixes. In most cases, scripts are directly portable to Microsoft’s MPI implementation, and programs should work with a recompile. There are some restrictions, however, such as being restricted to a single head node. Otherwise, topologies should be directly implementable. A good (if high level) overview of the programming environment is available by reading the Developer Tools Datasheet, which also gives some overview of some of the newer and less traditional technologies being researched (or already available) from Microsoft. Also, on MPI, for anyone interested in the subject, I can highly recommend the book Using MPI, available from MIT Press.

 

GPGPU (General Purpose GPU) Programming

 

So, you know that graphics card you have in your computer? It turns out it’s a pretty powerful processor in its own right. That’s pretty obvious, but, what might not be so obvious is that that power is usable from programs you can write. There are restrictions, and programming to graphics cards is in quite a lot of ways not standardized. For instance, NVidia provides a toolkit called CUDA, which allows you to do general purpose programming on their (newer) graphics cards. Similarily, ATI provides its CTM (Close To the Metal) toolkit. Neither are compatible with each other, and both provide a low-level way of using the graphics processor to do general purpose processing.

 

There are limitations to what you can do with a GPU. For instance, branching is generally unavailable. Yep, you don’t get an “if” statement. No case/switch statement, either. Looping? Nah. Also, programming to these devices is a different paradigm than what you may be used to. GPUs use what is called data parallelism, which means that you generally have a large array of data, and one instruction that you are going to apply to the whole set (this used to be referred to as SIMD processing – Single Instruction, Multiple Data). Programming in this way allows the GPU to split the data up, and, since GPUs generally have many subprocessors (albeit less powerful than the one(s) you might find in your CPU), allows the requested instruction to be run in parallel on many subprocessors. Another restriction is that the data arrays are immutable. You perform an instruction on a data array, and get another back, but the original never changes. Note the tie-in here to functional programming and functional languages such as LISP, Scheme, and, more interesting to me at this point, F#.

 

One cool thing that I have found recently on the subject of GPGPU programming is the Accelerator project from Microsoft Research. What it does is simplify the process of programming to a GPU by abstracting the underlying implementations (such as CUDA and CTM), and allowing you to program in a higher level language, such as C#.

 

So why would you do GPGPU programming? Because it’s incredibly fast! Check out the Game of Life sample in the Accelerator project. It takes a hit because it must translate from C# down to CUDA (in my case), and ship everything off to the GPU for processing, but, even with that hit, on my machine here, it is quite easily 16 to 19 times faster than the Life implementation I did in F#. An order of magnitude and then some! Very, very impressive.

 

Here is a MS Channel 9 video on the Accelerator project, with David Tarditi and Sidd Puri, who are researchers on the project. Very cool.

 

Distributed, Redundant, and Functional

 

Scatter, gather, map, and reduce. These are terms that are used with distributed, parallel, and functional programming. Because you are programming in a functional manner, and will not change the original data (ie, your functions do not have side effects), your data and its accompanying algorithm can be distributed (scattered) to an arbitrary number of processors. Your function can then be run (mapped) across all that data. The results can then be gathered from those multiple processors, and, if need be, the multiple results can be reduced into a smaller result set. This again is the SIMD approach mentioned in the section on GPGPU programming, and is a common theme across related technologies such as functionally based languages.

 

Add one more bit into the mix here: Computers fail. All the time. In fact, you could consider that, in any given set of compute resources, a certain number will be down at any given time, for hardware reasons, software reasons, or just that Billy tripped over the power cord. It happens. In fact, it happens all the time, and the systems in this section assume that it is the natural state of hardware to have failures. These systems, because they take a functional approach, can move processing tasks from one set of hardware to another, and continue with working hardware when (not “if”!) some compute node or set of nodes fails.

 

There are several systems that act in this manner, most notably at this time Google’s MapReduce, Hadoop (2), and Microsoft’s Dryad research project. Each are massively parallel and redundant systems capable of automatic failure recovery, geared towards functional programming. MapReduce is what Google uses to do its searches (among other processing tasks). An overview and tutorial is available from Google here. Hadoop is an open source implementation of the MapReduce concept, written in Java, which you can download and install on COTS hardware. It is available for Linux and for Windows, but only with Windows as a development platform. More information and a tutorial can be found here. Dryad is a similar technology available from Microsoft Research, but also has tie-ins to LINQ, allowing this technology to be more readily accessible to programmers in general, and to make this type of technology, which traditionally has been used for research, mapping the human genome, secret DoD projects and the like (check out the Top500 list), to be more readily accessible to less ivory-tower uses of computing horsepower at this scale.

 

Our New Robotic Overlords

 

Another area that is producing interesting and quite capable results in the realm of parallel processing, and making that level of computing power available to the average programmer is MS Robotics and the CCR (Concurrency and Coordination Runtime). A very interesting set of videos on what the CCR is and what you can do with it can be found here and here. Basically, robots need a lot of things to happen all at one time. There needs to be a good, robust, and easy to code to way to manage all these concurrent threads, without forcing the end programmer to build thread management, processor affinity management, and other related issues into their code – there should be a way to make this easy, especially since many- and multicore machines are becoming more commonplace. The CCR provides that functionality, and is finding uses outside the realm of robotics.

 

 

 

ParallelFX

LINQ is without a doubt a powerful addition to the .Net framework. With it, you can query over disparate data types, perform reductions, projections, and all the other neat stuff normally associated with database engines of various stripes. But, in order to get to that type of ability, certain other aspects needed to be added into the framework, such as the ability to construct lambda expressions and the ability to create queryable groupings of data. All this has added some decidedly functional aspects to some largely imperative and object oriented languages. Because of this, and, if you write your code so that anything you use inside a LINQ query adheres to some of the tenets of functional programming (that is, no side effects),  LINQ and its related functionality naturally proceeds to a state where parallelism is the obvious next step.

 

Enter ParallelFX. ParallelFX adds several things to the .Net framework, such as parallelism in queries, and loop parallelism. For instance, assuming you have written a LINQ query in a way where it can take advantage of it, PLINQ (Parallel LINQ) can split your query across available processors in your machine, speeding up the query significantly. Also, even if your machines are not multi-core right now, if you write your software in a way where it can take advantage of parallelism, PLINQ will automatically take advantage of multiple cores if they become available, or run without change on a single core machine if that is all it has.

 

Changing your queries is simple. The AsParallel() extension method provides the tie in to PLINQ. For example, a slight variation on some of the code in my A* in LINQ example from a while back might look like :

 

var neighborsQuery = from n in currentNode.Neighbors.AsParallel()

                     where n.NodeState == NodeState.Undetermined

                     select n;

 

ParallelFX also adds in the TPL (Task Parallel Library), which allows you to parallelize loops. For instance, a for each loop like this one

 

foreach (Node<Point> node in domain)

{

   GetNeighbors(map, node, domain);

}

 

Becomes a statement like this

 

Parallel.ForEach(domain, node => GetNeighbors(map, node, domain));

 

- where the lambda expression describes the function that will be mapped over the available data. If possible, the TPL will scatter the data and the function across available processors and cores. If there is only one available, it runs just like the original foreach.

 

ParallelFX can be downloaded here.

 

 

Going Deep

 

Aside from those specific subjects, I have discovered basically a gold mine of information on parallelism in the Going Deep series on Channel 9 (there’s also a lot, lot more neat stuff there than just the parallel programming stuff. It’s well worth checking out). Of particular interest have been the videos involving Erik Meijer (language designer and researcher at Microsoft), Bertrand Meyer (father of the Eiffel language, among many other things), and Anders Hejlsberg (chief architect over C#, and also such languages as Delphi and Turbo Pascal):

 

Anders and Joe Duffy on ParallelFX – very interesting talk on the subject. Anders makes a sideways reference to Gödel numbers J

 

Erik Meijer on Functional Programming – Erik describes F#, contrasts with Haskell, discusses functional programming in C#

 

Erik Meijer, Bertrand Meyer, and Yuri Gurevich discuss concurrency and use some classic problems (such as the sleeping barber problem) as illustrations

 

Burton Smith on General Purpose Supercomputing

 

Dan Reed on Cloud Computing

 

 

May 12
Exploring F#  with Conway’s Game of Life

The other day, I asked around here at Vertigo to see if anyone understood some new concepts that I had run across, those being monads and continuations. These are both concepts that you find in more functional languages such as Lisp, Scheme, Haskell, and so forth, but are also finding their way into more mainstream languages – notably in LINQ, due to its relationship to more functional ways of programming (such as its use of lambdas). We have a good environment here for asking questions like that, and the conversation soon evolved into other areas. One subject that came up was that of F#, which is a language that has come out of Microsoft Research. I had heard of the language before, but had not really explored it. I happened to have a little bit of bandwidth over the last few days, so, I thought I would do a little spelunking, and learn something new. Here's what I found J

So what's this F# thing, anyway?

What I found out is that F# is a programming language based on ML and OCaml. These are both languages with a high degree of functional programming aspects to them (which is not the opposite of dysfunctional programming), but which, unlike more 'pure' functional languages such as Scheme or Haskell, allows for more imperative styles of programming than those do – structures such as for and while loops exist, and look much as you would expect.

Unless you specify otherwise, everything in F# is immutable, much like the string construct in C#. This extends to areas that you might not expect. For instance, once you set an int to some specific value, that's it – you can't change it, unless you have marked it as mutable. The reason for this is because the language is primarily a functional one. Programs in functional languages contain functions which return values, which are then further used, etc – but each value or set of values is its own entity, and to change it while within a function is to produce a side effect, something undesirable in a functional language. In the strictest sense, functions return values but do not alter their parameters, or the outside world (which, incidentally, gives rise to the monad pattern mentioned earlier). F# follows these rules of functional programming, but also allows you to break those rules with the mutable keyword.

In F#, indentation is significant. For instance, take this bit of code out of my sample project :

if(i>0)then

   newRect.up <- rectsArray.[i-1,j].rect 

   rectsArray.[i-1,j].down <- newRect.rect

 

In that sample, everything indented under the "if" is part of the if statement's block. (note: this is using the #light directive, which imposes these rules)

F# uses type inference for most things. For instance, given a statement such as

let gridSize = 45

 

the compiler determines by its use that gridSize has to be an integer. This is not a variant from the old days, but instead is the rough equivalent of the var keyword in C# - the language is typesafe, but most often types are inferred. You can, if need be, specify the type explicitly, such as :

let redBrush : SolidColorBrush = new SolidColorBrush(Colors.Red)

but F# seems to do a pretty good job of figuring things out.

F# also makes extensive use of lists, which are not the generic lists you would find in the .Net framework proper. Instead, they bear more of a resemblance to lists in LISP. A short discussion of what lists in F# are and what they can be used for can be found here.

A Sample Application

I kicked around several ideas for an application to build and learn in, but eventually settled on making a version of Conway's Game of Life. Because F# is functional and because of that potentially can execute each of the cells in parallel, it seemed like a good match. Unfortunately, because of the way I wrote the program, and because I needed to update a user interface (producing a side effect!), I am instead iterating over each one of the cells. Maybe in a future version, instead of using the List.iter function to do the heavy lifting, I can instead write it in a way where I can use List.map, which implements function mapping, and, if we also do things such as stick to non-mutable values, opens up a program to parallelism. The main gist was to get my head around the language, and I'm definitely still learning! In some respects I approached the true spirit of the language, but a fair bit of it is still written in an imperative style. Maybe phase two.

I also wanted to hone my WPF skills a little. I wanted to originally write this with a Silverlight front end, but, alas, F# support for Silverlight is not there yet. I also wanted to see if I could write it in such a way that everything is backended in F# with a WPF front end. This is doable, but most examples use F# code to generate WPF objects, which is not what I wanted. I wanted to be able to design in XAML and have the code in F#. In order to do this, I had to create a C# shim application, which basically starts up and then as soon as possible hands over control to the F# code.

Give my creature Life!

You've probably seen versions of the game of life before, but the main idea is, depending on how many of its neighbors are active, a cell is either turned on (that is, 'born', turned off ('dies'), or remains the same. If a cell has two active neighbors, it stays the same. With three active neighbors, it turns on. Otherwise, it turns off. Pretty simple rules, but the patterns which emerge can be complex. One other note about Life is that it is one of the simpler forms of cellular automata.

A sample Life pattern

Here are some interesting patterns for the Life game.

This seed that produces a pulsar, which is a relatively complex shape that pulses between about four states:

The pulsar in one of its final forms :

This is an f-pentomino, a seed that does not stabilize in a short period of time, and produces some interesting results :

 

A "methuselah", which is a pattern that runs for an extended period of time before stabilizing:

After a short period of time, the above pattern grows to:

 

The code for this blog entry can be downloaded here. Enjoy! And, as this is definitely a new skill set for me, please feel free to tear the code apart, tell me how to do it better, etc.

 

Other useful links I found :

The F# Language specification

F# in 20 minutes – contains installation instructions

Bending Cats using F#

a notepad implementation in F#/WPF

ACO (ant Colony Optimization) in F#

a blog about F# and WPF, lots of useful information

lots of information on F#

F# lists and how to use them

 

Fun links :

"Ook", an example of a dysfunctional language (as opposed to functional)

an Ook compiler for .Net

hello world in Ook

Blog post about the task parallel library and F#

 

 

 

 

 

April 30
Computer vision processing in .Net, part II

In which we use AForge.Net to process images, find regions, and then use parallax techniques to determine range to those objects and their size.

In my last blog post, I discussed how you might be able to determine distance to objects by using two web cams. There are several ways that you might go about doing this, using such techniques as parallax, standard candle, and adjusting the camera’s focal length. The easiest way to do this was to get two cameras and use parallax to determine the distance between two objects as perceived by the cameras, and then doing some math to determine where that object is in three dimensional space. Well, two dimensional, anyway – currently what I have isn’t concerned with the y axis. It also is not interested in object recognition, although that is an interesting field in itself, and there are tangential connections.

Before we proceed into how to determine where objects are using parallax and two web cams, I have run across some other interesting things that are related. First, it is possible to do depth perception with just one camera. Pigeons do it, and so do a lot of other animals which don’t have their eyes pointing in the same direction. When pigeons bob their heads, or when praying mantises sway from side to side, they are using optical flow to determine where objects are. Some interesting research being done using this technique is using the AIBO, Sony’s robotic dog. The AIBO only has one camera, but it is very likely possible to use optical flow to determine where objects are on the AIBO platform. Information on a project researching just that can be found here. Oh and one more thing about pigeon sight - in humans, persistence of vision makes it so that if something refreshes above a rate of, say, 60 to 70 hertz, we do not see individual frames, but instead see a steady image. Not so with a pigeon. They can see individual frames up to a refresh rate of about 100 hz (see The Flicker Fusion Factor, which also claims humans are ill equipped to drive cars. Assumedly pigeons might do all right). That screen that you’re looking at right now? Totally flickering up a storm for pigeons.

Of other interest is object recognition, which I do not know much about yet. But, an interesting subject I ran across is that of geons, which in this context are geometric primitives. The idea with geons is that there are basic geometric shapes which the brain perceives, and then relates to one another in order to make a map of what it perceives optically (and supposedly by other means as well). To me, this sounds logical, although I’d like to see better proof, but it would fall into some of the basic ideas put forth in Artificial Intelligence: A Modern Approach, in which the authors talk about simplification of the problem domain in order to solve complex problems.

One other thing I have been reading about lately is Ant Colony Optimization – because the world needs more efficient  ant colonies! ACO is basically a highly parallelized search algorithm using multiple agents in order to find optima in n-ary space. (What? Was that English?) In other words, that’s “ants crawl around randomly until they find the goal, and then other ants follow”. The beauty is that in a situation in which there is more than one possible solution, or when the goal changes or the solution changes, ACO can come up with a solution, and keep coming up with solutions when one of the known solutions fails. Here’s an example, written in VB. It’s really kind of fun to watch.

One other thing on the cool & fun list – I have been poking around in a book on Google Books called The Computational Beauty of Nature, which is currently on my Amazon wish list. I may buy it, but then again I may soon have access to a kindle, in which case I may save a tree and just kindle it instead.

 

Cameras

A word on the hardware. The cameras I had picked up at WierdStuff in Sunnyvale Did Not Work Well With Others. They were old, and the driver – of which only an open source driver exists at this time – was incapable of supporting more than one camera on a single machine. Ok for a lot of setups, but not this one. So I broke down and got a couple of Lifecams, which worked perfectly, pretty much right out of the box.

AForge. Net

For this project, I am using a library called AForge.Net for (among other things) processing images, detecting motion, doing edge detection, and so forth. It is open source, written in C#, and has other libraries for such things as genetic algorithms, neural nets, and so forth. It’s a pretty interesting project to poke through. For efficiency, a fair bit of the bitmap-centric code seems to use unsafe code and pointers, but that’s ok, because it’s pretty fast for most things. There are also multiple write-ups on codeplex by one Andrew Kirillov, who is also the project owner for AForge. In order to process the images I am receiving from the two web cams (See my previous post), I am using the AForge libraries.

Steps to depth perception

After we have acquired two images from our cameras, there are several steps we need to take in order to determine where objects are in our images. One of the more common techniques in this process is to use binarization in order to create a black and white image from the source image, where each pixel is either black, or white. Most black and white images are grayscale images, where each pixel represents a sliding scale. When a picture is binarized, each pixel is converted to either an on or off state, depending on a threshold value. Over the threshold, it’s on. Under, it’s off. The piece of code in AForge.Net that determines where blobs are requires this, and it is a common step to take in computer vision systems (see also Computer Vision For Robotics:An Introduction). It allows images to be more easily manipulated in code, and is also another example of simplifying the problem domain in order to derive a solution. Once we have binarized the image, we can then search for objects in the images and compare them.

Another technique, which is a bit more processor intensive, is to use an edge detection algorithm to boil down the images into more succinct representations, and then do basically the same process as before.

AForge.Net has classes which allow us to do all of these. In my example, which you can download below, I have an option to use either of these techniques. The second, while potentially more accurate, is slower, but, because of my target hardware (an 800mhz Via C3 based machine that used to be a thin client terminal but now runs XP), is not necessarily suitable for its target environment.

Image Preparation

AForge.Net provides a wide array of image filters - classes which alter an image in some way. For instance, some might change the brightness or contrast, sharpening, and so forth. Many of these might be useful in what we are doing here, and in fact I have played with a few of them to varying effect. Some produce better results than others, and some produce good results, but at the expense of speed. Some of the ones I have found that work well are Threshold (which I use in this example), Pixellate, and CannyEdgeDetector. Others have proven interesting, such as ContrastCorrection, Dilatation, and CornersMarker, but after much experimentation I came up with two basic filter sequences I used :

Grayscale => Threshold => Invert, which is fast but sometimes inaccurate, and

Grayscale => CannyEdgeDetector, which is slower but sometimes more accurate

To see the code that does this preparation, see PrepareImage() in the accompanying code. Also, at the top of MainForm.cs, there is some commented out code for other filters that were interesting or useful.

 Once we have prepared the images, we can pass them to a detector of some sort in order to determine where objects are in the images. Aforge.Net provides several classes which do object detection of different kinds. For instance, there is an assortment of motion detectors (see here and here for examples) which can outline movement regions, and, in some cases, return an array of rectangles indicating where these motion areas are.

Put It Down, Flip It and Reverse It

For my first attempts at this, what I did was to use the CountingMotionDetector class, which outlines motion areas using rectangles, but also allows you to retrieve a collection of those rectangles. The thought was to get the rectangles from the images, flip the order that the images were processed in, reprocess and get the rectangles from the reversed order, and then calculate distances using the rectangles. Didn’t work. Unfortunately, the CountingMotionDetector class was too smart for me, and, no matter if the images were reversed in processing order or not, it always returned the rectangles at exactly the same positions. What I ended up doing, instead, is using the BlobCounter class to get rectangles from the individual images, and then match those rectangles with their counterparts. This is actually advantageous, because the BlobCounter class is lower level and faster. To see how the BlobCounter class works, see the GetRectangles() method in the accompanying code.

Matching Rectangles

Once we have rectangles in hand, the next step is to match rectangles, and then to do the math to determine distance. This is not as straightforward as it seems, because rectangles overlap, are not of the same size (but are of similar sizes) in the two images, there’s multiple possible matches, etc. What I did was use a bit of probability in order to determine which rectangles are the best matches for one another. Basically I scored the rectangles on how well they match one another, and then ran a LINQ query to determine the best matches, and bind these to one another.  The rectangles are scored on several criteria, such as how similar their areas are, how similar their widths and heights are, and so forth. Each score is a value between 0 and 1, inclusive, which I then add together to come up with a final score. This final score is obviously not in the range of 0 to 1, and therefore isn’t strictly in the realm of what we would classically think of probability, but, it does bubble up the highest ranking match to the top, which is what we need. Also note – this type of probabilistic reasoning, where instead of strict Boolean values we have ranges from 0 to 1, is a beginning point into fuzzy logic and how fuzzy reasoning works. In essence we are using a bit of fuzzy logic to match the rectangles – and using LINQ to do it! To see how all this works, check out the GetSimilarRectangles() and GetBestMatch() methods in MainForm.cs. There you will see LINQ queries to determine what the rank of matches are and how to get the top match. Look in GetNearScore() to see where the probabilistic logic takes place – and where I strayed from it! (For instance, the result of subtracting the abs() of the two y values is out of range for a canonically fuzzy value). You will also see some code that is commented out in there.  While those work well, I have found that the ones I left in are adequate and in general provide good results. The others I left in by way of example.

Doing the Math

From here, what we need to do is some crunching of numbers. The code that does this is in GetDistanceInfo(). An example of the kind of calculation we want to do can be found here. Basically what it does is calculate distance and angle, using the difference between the x values of the two rectangles, and the distance that those two rectangles are from the center line of one of our images. This gives us polar coordinates, with the cameras placed at the origin point. From there, we can manipulate the data as we please, for example plotting locations on a Cartesian graph (which is essentially what the example is doing). The calculation itself uses the disparity between the two objects (the perceived  x distance between them),the camera’s angle of view, and the camera’s focal length to determine polar coordinates for the perceived object. A discussion of this can be found here. From here, by way of example, I display the values on a graphic that looks like a radar screen.

Other interesting things I found along the way

Iplab_demo is an interesting example of the image processing capabilities of the AForge.Net library. It was very useful in determining what filter sequences worked well in the blob-finding process.

XDMessaging is a library which uses the windows message pump to communicate between processes/AppDomains on a single machine. It requires no setup and is relatively performant. I’m using it in the non-demo version of this code to communicate the vision info back to a consuming application.

An interesting and easy to digest page on stereopsis can be found here.

But most of all The Ants! You gotta check out the ants.

The Code

To play with the code for this blog post, download it here. While it is a demo and should work fine standalone, please consider it to be a work in progress. There are many things in it that could be improved upon. Also I used many references and examples to get this together. All cites and references should be intact in this document and in the code. If I missed anyone, please let me know and I will add ex post haste. Enjoy!

April 08
Computer vision processing in .Net

In my spare time, I have been playing around with vision processing. Basically, I have an ongoing robotics project I’m fiddling with off and on. What I need to figure out at this stage is how to do rangefinding, that is, finding the distance to objects near the robot. There are many options available, such as sonar, laser, and so forth. But, what I decided on doing, for various reasons, is to play with stereoscopic vision. The main reasons are that web cams are easy to find, and you can find older ones for a bargain. If, I thought, I could get two matching web cams, I may be able to use them to analyze images and determine where objects are and the distance to them. This is an ongoing project, and is far from complete (and with no guarantee of success!), but, I can report on some interesting things I have learned so far.

**Warning** I am far from an expert on any of this subject matter. Pretty much all of it is new to me, but I’m having fun exploring! Also, this describes a work in progress. If you see errors please feel free to point them out, I’ll certainly appreciate it!

How it works

I’m no expert on vision, but it seemed to me that with a little bit of trig and some known values (such as the distance between two cameras), then, assuming that you could somehow determine where objects were in images from those cameras, then you could do a little math and figure out where objects are in 3D space. After a little reading, it turns out that there are indeed several ways that this could be done.

Parallax

One of the ways that you can tell distance to an object by using imaging is to take two images at different locations, and then compare the two images in order to determine how far away they are. Objects in the images will shift depending on how far away they are – nearer images will shift more than ones that are far away. This is the method that astronomers use to determine distances to celestial objects, such as stars and galaxies. It is also one of the ways that we as humans use to determine how far away things are in our environment. For a fuller explanation of what parallax is and how it works, see the Wikipedia article. There is also a very good discussion of this and subjects related to it here.

So, it turns out that yes, we certainly can do what I stated above, in that if we can examine two images and figure out the positions of objects, we can then measure how much they have changed from one picture to another, and determine distance. The trick will be determining where objects are in the images. This may be possible, though, by processing the image using edge detection of some sort.

Standard Candle

Another way that we may possibly figure out how far away an object is, is by examining its luminosity. Photons radiating out from an object tend to dissipate over distance, for many reasons, such as hitting molecules in the atmosphere, the fact that they are radiating in an expanding sphere from a central point, and so forth. This leads to another way astronomers and people who use photogrammetry determine distance to objects. For instance, astronomers use red giants and supernovae as standard sources of light (standard candles) If we can determine what a standard source of light in our image is, and then compare other objects in the image to that known value, we may be able to determine distance. An advantage of this method is that we can do it with one camera. A disadvantage is that not only do we have to find objects in our image as before, but we also have to recognize what that object is in order to determine what its luminosity should be if it were at a standard distance. This requires knowledge of the problem domain that we may not necessarily be able to provide, and in fact probably cannot. Remember, we are after range finding here, not object recognition.

So standard candle is very likely out. It would require us to recognize objects to some degree, and also have knowledge of any environment that we may find ourselves in. That’s a big database and hefty processing power – and if we’re going there, buying a laser range finder at Lowe’s and hacking it might be an easier solution than all this vision stuff.

Focus

Another method we may be able to use is by using properties of the camera itself. Basically, if we use the focus of the camera and know what the focal point of the camera is at any given time, anything that is in focus is at the specified range. For instance, if we know the camera is focused at five feet, anything that is in focus is five feet away from us. This may work. The only problem is, is that the cameras I currently have cannot be programmatically focused. I would have to rig up a servo or a stepper motor and then alter the focus in that way. Doable, and probably fairly accurate if we can find edges correctly, but it means buying hardware and soldering things J.

So, parallax it is, for the moment, with the possibility of using the focus technique later, and probably not using standard candle techniques.

WIA and DirectShow

The first thing we need to do in order to make this happen is to get images from two cameras. The cameras will be mounted at a known distance from each other, be identical in make, and be calibrated to point in as close to the same direction as possible. A quick trip to WierdStuff netted me some cheap $5 web cams to play with. Unfortunately, there were some lessons learned here. WIA, (2) the Microsoft architecture for acquiring images from imaging devices such as scanners and web cams, proved to have some shortcomings that were insurmountable with the cameras I chose. Specifically, it appears that calls to capDriverConnect can only connect one imaging device at a time. That won’t do – we want both cameras up to speed and streaming video at the same time. Turning them on and connecting to them is a time consuming process. We will be unlikely to get realtime processing, with all this, but waiting 30 seconds to a minute between images is too much. Also, it appears that the drivers for the (very obsolete!) cameras I picked up assume that only one of these cameras are going to be connected to a computer at any given time, ever. If you connect two, Windows recognizes the second device, but the driver will only ever talk to the first one.

Unfortunate.  But, with more modern cameras, it is possible to use other technologies to have both cameras running at the same time and to talk to them simultaneously.

DirectShow

Fortunately, I did have some newer webcams. They don’t match, so they won’t do for the image acquisition in the long run, but they were enough for me to get up and running and write some code that will do image acquisition from two web cameras simultaneously. In order to do this, I used a library I found on SourceForge called DirectShow.Net, which allowed me to start up both web cams at the same time, begin image acquisition on them, and then acquire sample bitmaps from the image stream at predetermined intervals. The code I started with is from their July 2007 samples.  DirectShow.Net is a wrapper around the COM objects comprising DirectShow.  I can’t say I am fully up on graphs and pins and how to associate them correctly, but by using their DxSnap example (in the DirectShowSamples-2007-July\Samples\Capture\DxSnap directory), I was able to create an application that snaps two pictures simultaneously from two separate web cams. The gist of what I needed to do was set up two Capture objects. So, at the class level of the main form :

private Capture cam1;

private Capture cam2;

 

and then in the constructor:

 

cam1 = new Capture(1, VIDEOWIDTH, VIDEOHEIGHT, VIDEOBITSPERPIXEL, preview1PictureBox);

cam2 = new Capture(2, VIDEOWIDTH, VIDEOHEIGHT, VIDEOBITSPERPIXEL, preview2PictureBox);

 

The reason why I have values 1 and 2 above is because I also have a PCI TV tuner card in my dev box. Without that card, the values would be 0 and 1. And, of course, when this becomes a more robust program, those will be configurable values in a setting file of some sort. Because the target machine is relatively low power, and processing a web cam can be fairly process intensive, there were some other tweaks I did. Ultimately, the images are not for human consumption, so we do not need to look good. We can reduce the resolution (in the main form):

 

const int VIDEOWIDTH = 320;

const int VIDEOHEIGHT = 240;

 

And, since we don’t need to worry about flicker or problems with persistence of vision, we can drop the frame rate significantly. Normally, directShow will use the camera’s settings, which are usually the maximum it will support. We don’t need that. In fact, we can go well below 12 FPS if we so choose. What I did to do that, is in the Capture class, alter the SetConfigParams method to specify the frame rate. The VideoInfoheader structure, which defines how the video stream will be rendered. The AvgTimePerFrame property expresses how much time will be spent on each frame, expressed in nanoseconds. Since I wanted it to be approximately 12 FPS, I set it to 833000, which, 1 second in nanoseconds (1,000,000,000 per second) divided by 12 gives us 833,333.333, which I approximated. After these changes, the processor hit was much less intense.

 

After that, I created a small routine (using examples from the DxSnap example – thanks!) in the main form which I can call, specifying a capture device, which will snap a picture from the specified stream and return it :

 

private Bitmap TakePicture(Capture cam)

{

 

        // Release any previous buffer

        if (m_ip != IntPtr.Zero)

        {

                Marshal.FreeCoTaskMem(m_ip);

                m_ip = IntPtr.Zero;

        }

 

        // capture image

        m_ip = cam.Click();

        Bitmap b = new Bitmap(cam.Width, cam.Height, cam.Stride, PixelFormat.Format24bppRgb, m_ip);

 

        // If the image is upsidedown

        b.RotateFlip(RotateFlipType.RotateNoneFlipY);

        return b;

}

 

From here, it’s now a matter of processing images. I’ve used a good bit of some sample code in the DirectShow.Net samples download, but we’re now at a point where we can get images from two web cameras at regular intervals, which I’ve done by setting up a timer:

 

private void SnapshotTimer_Tick(object sender, EventArgs e)

{

        SnapshotTimer.Enabled = false;

 

        Bitmap b1;

        Bitmap b2;

 

        b1= TakePicture(cam1);

        b2=TakePicture(cam2);

 

        ProcessImages(b1, b2);

 

        SnapshotTimer.Enabled = true;

}

 

(We’ll get into what ProcessImage does in future installments!)

 

 From here, what we will need to do to meet our original goal of determining distance to objects by using the parallax method, we will need to do several things. First, we need to determine where objects are in the individual images. Once we have those locations, we need to correlate those to each other – that is, we need to know which images in one image correspond with images in the second. After that, we can measure differences between the objects we find and come up with numbers which we can then do computations on using what we know about computing distance via the parallax methodology.

 

Next :  Motion detection and object detection using AForge.Net

 

March 26
Implementing the A* algorithm in C# using LINQ and Generics

Some ongoing interests of mine have been things like robotics and artificial intelligence. In my spare time, I build small robotics projects just for fun, and have been involved in a larger robotics/art project, called OrbSWARM, which is a large scale kinetic art project experimenting in swarming behaviors and human/machine interaction. One of the current interests of mine have been implementing a controller program, in C#, which implements a hybridization of the agent/behaviors based approach to machine intelligence proposed by Marvin Minsky in his Society of Mind, and in other resources.

One of the ongoing problems in machine intelligence is to find a way for the machine to efficiently find a solution to specific problems presented to it. For instance, some of the cliché problems may be things like Towers of Hanoi, the travelling salesman problem, the knapsack problem, etc. For each of them, a typical way to implement a solution to the problem might be for the machine to find specific subgoals en route to the main goal. In my particular example, the program I am working needs to be able to find a route between two points on a map. Other parts of the program generate an internal map of the encountered world, represented as a bitmap, and yet other parts of the program choose a point where we desire the machine to be. The part of the program we are interested in here is the part that takes the map, and determines a route between where we are now, and where we would like to be.

There are many ways we can solve this problem, and each has its drawbacks and advantages. For instance, in an earlier iteration of the program, I used a binary search algorithm which used the bitmap itself, by drawing on it and sensing pixels directly to find paths between the goal and start positions. Basically, this works the same way the Paint bucket works in Paintbrush to paint an area, but starting at two points at once, and not filling in everything (it also uses a best-first approach). While this works, and has the advantage of allowing for stochastic variations in the resultant path, it is computationally intensive, and perhaps not suitable for lower power machines (such as Arm9, Transmeta, or Via Eden/C3 based machines) that a robotic controller program might find itself running on. For machines like these, efficiency in both memory and computational iterations really matters – the lower your O(n), the better.  Because of this, if we were able to represent the world in a more abstract manner, and have fewer data points to work with, it would be better. Also, abstracting the problem might allow us to reuse the code in more problems than just the current one – for instance, we’re talking about waypoints on a map for this specific problem, but what if we were able to apply what we build to generating workflows – for instance, a set of known delegates that point to methods that may or may not get us to a specific goal state? If we could assign costs to each one of those delegates, and determine what possible next steps there are for each, then an abstracted solution here might help us find solutions in that problem domain, as well.

The A* algorithm

Enter the venerable A* algorithm, one of the primary ways you might use to search a problem domain to find a solution. At a high level, the A* algorithm uses a best-first approach using a cost and a heuristic, to iterate through the nodes in a problem domain in order to find a path between a start state and a goal state.  Some definition of terms is in order. The cost in the above statement basically means the cost to get to a particular intermediate state. For instance, if we are trying to find a route on a map, the cost might be the distance travelled to get to a specific point. Cost can be things like distance, but can also take into account the difficulty of acquiring a particular state – for instance, if we had to go through an alligator infested swamp surrounded by thistle bushes, or worse, had to deal with a nest of the Dreaded Copperheaded Water Rattler, the cost is higher than a waypoint in an easy to traverse area. The heuristic is similar to the cost, but is more or less its inverse. The heuristic is basically a guess of what the remaining cost is to reach a specific goal. In our case, it may be the length of a straight line-of-sight line between the current state (a point on the map), and the desired goal state (our ending point).  The idea is that we will look at each possible next state, choose the one with the best possibilities (calculated as cost + heuristic), and examine that node – and this is what ‘best first’ means.

It would be nice if we could abstract some of that, and generics in C# allow us to do so.

Description of Node<>

In my implementation of A*, what I have basically done is divide the problem into several sections. The first, and easiest to implement, is a generic Node class. The Node class represents states in a particular problem domain. For instance, in our concrete example here, each node needs to keep track of possible waypoints in a path between physical points on a map, represented as the .Net Point structure. In order to be abstract about this, what I have done is to create a generic class which contains several properties which will help us track the node and our progress. I’ll highlight some of the interesting bits here.

Firstly, each node needs to know about the nodes that it is connected to.

public List<Node<T>> Neighbors { get; set; }

 

In our particular instance, this will be the points to the left, right, up and down directions from the current point, (if any exist there – if the map says there is a solid object there, we don’t create a point),

Secondly, we need to know what state the node is in. Later on, we will need to make decisions based on whether a node has been examined, is currently able to be examined, or has not been examined yet  (“closed”, “open”, and indeterminate state in normal A* parlance).

public NodeState NodeState { get; set; }

NodeState is an enum, defined as :

public enum NodeState

{

    Undetermined,

    Open,

    Closed

}

Next, we need to track what value this node represents, and it will also be useful to know what the goal value is. Also, in order to construct the path between start and goal states, each node in an A* scenario needs to track what its previous node was – what point we are adjacent to that was examined right before this one was examined.

        public T Value { get; set; }

        public T GoalValue { get; set; }

        public Node<T> PreviousNode { get; set; }

 

So that’s some basic housekeeping, and now for the interesting stuff.  As far as figuring out the cost and the heuristic, we can write code in the Node class that adds them together, but in order to abstract the definition of each of those, we really need to allow the consumer of our class to plug in a definition for each. Using lambda expressions would be nice, and the Func generic delegate allows us to do this.

        public Func<Node<T>, int> GetHeuristic { get; set; }

        public Func<Node<T>, int> GetCost { get; set; }

 

Also, because not all objects we will be dealing with implement IComparable (in this case, conspicuously, Points) we cannot constrict the Node class to something like IComparable or IEquatable<>, so, what we can do instead is define the method of obtaining equality elsewhere :

        public Func<Node<T>, bool> IsEqualFunc { get; set; }

 

All this will allow us later on to plug in our cost and heuristic in a more readily readable way. For instance, if we are dealing with a Node<Point>, we could do this :

Node<Point> point = new Node<Point>()

{

   GetCost = (n => Math.Abs(n.Value.X - startNode.Value.X) + Math.Abs(n.Value.Y - startNode.Value.Y)),

   GetHeuristic = (n => Math.Abs(n.Value.X - goalNode.Value.X) + Math.Abs(n.Value.Y - goalNode.Value.Y)),

   IsEqualFunc = ((n, m) => n.Value == m.Value)

};

 

- which effectively gives us the cost and heuristic based on straight line-of-sight distances we described above. To deal with hard-to-reach states, such as alligators and thistle bushes, we would alter the cost and heuristic for each node in the problem domain, as is appropriate. We could also refine what we have above – when using A*, it’s important to not over/underestimate the cost and heuristic.

Creating the domain

Now, on to more concrete stuff, for the moment. For our particular goal of finding a path between two points on a bitmap, what we need to do is define a set of nodes which represent our problem domain. If each pixel is a node, the process will become computationally intensive, and simplification of the problem domain is conducive to solving problems such as these – the human mind does things like this all the time. To do this, we’ll make a sample of the bitmap at regular intervals – for instance, we’ll check every fifth X and Y position, or every tenth, etc. This can work, if we know that the sample rate is less than the resolution of the map. For instance, if the smallest obstacle we may encounter on the map measures 10 by 10 pixels, then we can only sample up to 10 – otherwise we can get into a situation where we record a square (11x11, say), as being open and having no obstacles, when in fact we should have not recorded the point (meaning that there is an obstacle there).

The code for this is a couple of nested for loops

List<Node<Point>> domain = new List<Node<Point>>();

 

// get the available points

for (int i = 0; i < map.Width; i += resolution)

{

        for (int j = 0; j < map.Height; j += resolution)

        {

                if (map.GetPixel(i, j) == backgroundColor)

                {

                        domain.Add(new Node<Point>() { Value = new Point(i, j) });

                }

        }

}

 

(backgroundColor is a predefined color used on the bitmap which represents unused space, that is, no obstacles). From here, we need to iterate through all the nodes that we have thus far created, and determine what its neighbors are. For our example, we need to check if there is a node representing a point at each of the cardinal directions, relative to where we are. For instance, if we are examining a point at (1,1), we need to see if there are points in the domain we just created at

{ (1,0), (1,2), (0,1), (0,2) }

For each of those that exist, we need to add those to the Neighbors property of the Node we are examining. We could set up a few methods to iterate through the domain, and then for each, iterate through the domain again, searching for neighbors, but, LINQ is perfect for this.  We could do this in the same method as the for loops, but, just for clarity’s sake, I’ve broken this part out into a separate method:

private static void GetNeighbors(Node<Point> node, List<Node<Point>> domain, int resolution)

{

        var query = from n in domain

                where (

                        ((n.Value.X == node.Value.X - resolution) && (n.Value.Y == node.Value.Y)) ||

                        ((n.Value.X == node.Value.X + resolution) && (n.Value.Y == node.Value.Y)) ||

                        ((n.Value.X == node.Value.X) && (n.Value.Y == node.Value.Y + resolution)) ||

                        ((n.Value.X == node.Value.X) && (n.Value.Y == node.Value.Y - resolution))

                )

                select n;

 

        node.Neighbors.AddRange(query);

}

 

Also note, using LINQ, it is pretty easy to change the way we calculate neighbors.  For instance, if we wanted to have an X pattern to our search graph rather than a plus-shaped pattern:

var query = from n in domain

        where (

                ((n.Value.X == node.Value.X - resolution) || (n.Value.X == node.Value.X + resolution)) &&

                ((n.Value.Y == node.Value.Y - resolution) || (n.Value.Y == node.Value.Y + resolution))

        )

        select n;

Once we have a domain set up (and determine the start and goal nodes as represented in the domain from the physical points on the bitmap, omitted here for brevity), we can again deal with the problem solving in a more abstract manner.

The A* algorithm

One of the easiest to understand descriptions of the A* algorithm I have found is in the book AI for Game Developers, from O’Reilley, on page 129, which reads :

Add the starting node to the open list

While the open list is not empty

        {

                current node = node from the open list with the lowest cost

                if current node = goal node then       

                        path complete

                else

                        move current node to the closed list

                        examine each node adjacent to the current node

                        for each adjacent node 

                                if it isn't on the open list

                                        and isn't on the closed list

                                                and it isn't an obstacle then

                                                        Move it to the open list and calculate cost

        }

 

The discussion of A* in this book is a good one, easy to read with concrete examples – I would suggest reading it online to get an overview of how the algorithm works, coupled with the Wikipedia article on the subject. Basically, though, it first examines all open nodes, picks the best available, sets the state of the previous best node to ‘closed’, then sets the new best available to the current, rinse and repeat until you find the goal state.

If you look at the algorithm as outlined above, it is doing a bit of moving things from one pile to another. This can quite possibly lead to a more efficient approach than using LINQ, but, if you look at it from another standpoint, what it’s really doing is managing states on multiple subgoals until it builds a chain from the start state to the goal state. In a nutshell, it is performing a bunch of set operations and then manipulating the resultant sets – and that is something LINQ excels at.

In order to implement the A* algorithm in C# with LINQ, then, we need to iterate through some set operations. Let’s assume we have the domain created from before, and that for each item in the domain, we have set the lambda functions to get the cost, heuristic, and test for equality.  Let’s also assume that the pump has been primed as far as the ‘open’ nodes by setting the start node’s status to ‘open’. The first time we check for nodes, we will get exactly one – the start node.

From that starting point, we’ll run a loop until we either run out of options, or we find the goal state. The first thing we need to do in our loop is query the domain, and find all open nodes. From that list, we need to pick the best one. Our generic node object knows how to combine the cost and heuristic (both integers) into a total, so we’ll use that and pick the one with the lowest number. LINQ makes this easy :

currentNode = (

                from n in domain

                where n.NodeState == NodeState.Open

                orderby n.GetTotalCost() ascending

                select n

              ).FirstOrDefault();

 

From this, we’ll either have a Node, or null. In the case of null, A* has exhausted all possibilities, and we cannot find a solution. In the case of a value, we either have the goal state, or an intermediate step on a possible path to the goal. Assuming not null, what we should then do is use IsEqualFunc to see if we have the goal state, and, if not, we’ll need to prep for the next loop through. In order to prep, what we’ll need to do is set the current node’s state to ‘closed’, and then set all neighbors which have not been examined to the opened state. LINQ makes this easy, too :

currentNode.NodeState = NodeState.Closed;

 

//open the adjacent nodes that have not been examined

var neighborsQuery = from n in currentNode.Neighbors

                     where n.NodeState == NodeState.Undetermined

                     select n;

 

neighborsQuery.ToList().ForEach(n => { n.NodeState = NodeState.Open; n.PreviousNode = currentNode; });

 

Before we run another loop, we need to make sure that it is still possible to reach the goal state. If there are nodes with a status of ‘open’, then  there is a possibility that this is true.  We can use LINQ to check this, as well (note: the ‘success’ boolean may have been set to false elsewhere, hence the ‘&=’ operator)

searching &= (

                from n in domain

                where n.NodeState == NodeState.Open

                select n

             ).Count() > 0;

 

When you put all those steps together, along with some support code, and do a little rearranging so it fits into this blog’s editor, the search routine looks like this:

public static List<Node<T>> Solve(

                                Node<T> start,

                                Node<T> goal,

                                Node<T>[] domain,

                                Func<Node<T>, int> getHeuristic,

                                Func<Node<T>, int> getCost,

                                Func<Node<T>, Node<T>, bool> isEqual,

                                bool showProgress

                        )

{

        Node<T> currentNode = null;

        bool success = false;

        bool searching = true;

        List<Node<T>> result = new List<Node<T>>();

 

        start.NodeState = NodeState.Open;       //Add the starting node to the open list

 

        foreach (Node<T> node in domain)

        {

                node.GetCost = getCost;

                node.GetHeuristic = getHeuristic;

                node.IsEqualFunc = isEqual;

        }

 

        while (searching)

        {

                currentNode = (

                                from n in domain

                                where n.NodeState == NodeState.Open

                                orderby n.GetTotalCost() ascending

                                select n

                              ).FirstOrDefault();

 

                if (null == currentNode)

                {

                        success = false;

                        searching = false;

                }

                else

                {

                        if (currentNode.IsEqual(goal))// found the goal!

                        {

                                success = true;

                                searching = false;

                        }

                        else

                        {

                                currentNode.NodeState = NodeState.Closed;

 

                                //open the adjacent nodes that have not been examined

                                var neighborsQuery = from n in currentNode.Neighbors

                                                where n.NodeState == NodeState.Undetermined

                                                select n;

 

                                neighborsQuery.ToList().ForEach(n => {

n.NodeState = NodeState.Open;

n.PreviousNode = currentNode;

});

                        }

                }

 

                // if no more open nodes, stop - if we have not found

                // it yet, we can't get to the goal state

                searching &= (

                                from n in domain

                                where n.NodeState == NodeState.Open

                                select n

                        ).Count() > 0;

 

                // collect the current results and inform our listeners

                if (showProgress)

                {

                        result = GetResultList(goal);

                        if (null != SolverProgress)

                        {

                                SolverProgress(null, new SolverResultEventArgs<T>() { Result = result });

                        }

                }

        }

 

        result = GetResultList(goal);

        if (null != SolverProgress)

        {

                SolverProgress(null, new SolverResultEventArgs<T>() { Result = result });

        }

 

        return result;

}

 

Where to get the code

If you’d like to see the code for what I have described here, it is downloadable from here. Consider it a work in progress, with many false starts. There is also some fairly old code in there. The parts that are of interest here are in the ArtificialIntelligence.AStar assembly. To see it work, run the UI, and press the [A* Search] button. To see the binary search mentioned earlier, press the [Search] button.

March 25
Grid Computing in .Net

In a recent discussion, the subject of grid computing came up. If you’re into it, you probably know about Digipede, which is a company based in Oakland which provides libraries which allow you to create massively parallel programs in .Net . Digipede allows you to set up a small two node cluster for experimentation purposes, but more nodes cost money.  

 

There is however, an open source project called Alchemi, which has a programming model which is very, very similar to Digipede, but is open ended and free. I run Alchemi on my home network and have had good success with it.

 

Also, of course, there is MS Compute Cluster server, which is a canonical implementation of MPICH on the MS platform (and in my experience is fairly difficult to integrate into .Net):

http://www.microsoft.com/technet/ccs/overview.mspx

 

Also of interest, although not specifically grid computing, is the research language from MS research, and its asynchronous methods :

http://research.microsoft.com/Comega

http://en.wikipedia.org/wiki/C%CF%89  

http://research.microsoft.com/comega/doc/comega_tutorials_concurrency_extensions.htm

 

January 30
An example LINQ statement

In a project I am on, I recently created a real-world LINQ query that I thought was an interesting example. What I basically need to do is look at a library in SharePoint, determine if certain documents in that library meet certain criteria, and then iterate through those documents and perform some action. This could be done with a for/foreach loop, but, this is something LINQ does well.

In Sharepoint, when you retrieve data from a list or library, what you get back is a SPListItemCollection, containing - you guessed it - SPListItem objects, which contain information about list/library entries. This collection implements IEnumerable, but from my experimentation did not work well with the AsQueryable() extension method, which takes a collection and turns it into an object implementing IQueryable, so that it can be used in LINQ statements.

So, basically I have two problems to solve. One, get a SPListItemCollection into a state where I can query it. Secondly, execute my LINQ statement and do things with the results. (Note : CAML is also a solution here).

So, here's the LINQ statement :


var items = from i in SharepointHelper.GetListItems(0, "Documents", "").ToList()
                     where i["FileRef"].ToString().Replace("." + i["File_x0020_Type"], "").EndsWith("XX")
                      select i;


 

Let's take it apart a bit. 'var items' will hold the result set. It could also be expressed as IEnumerable<SPListItems>. SharepointHelper is a helper class I wrote for working with sharepoint libraries, and GetListItems() takes the name of a list or library as its parameter, and performs the neccessary actions to return the items in that list. It returns an SPListItemCollection, which unfortunately is unsuitable for use in LINQ. Because of this, I created an extension method which would return a List<SPListItem> :


public static List<SPListItem> ToList(this SPListItemCollection collection)
   {
      List<SPListItem> items = new List<SPListItem>();


      foreach (SPListItem item in collection)
      {
         items.Add(item);
      }


      return items;
}

The 'this' keyword in front of the first parameter denotes that this is an extension method, and will be appended to the list of available methods for objects whose types match the parameter. When this library is in scope of SPListItemCollection objects, a new method, ToList(), will appear in intellisense and be callable on that object.

The where clause is interesting in the LINQ statement. What I am doing is seeing if a file, sans extension, ends in “XX". There's nothing truly special about the statement - it simply evaluates to a boolean value, but, it demonstrates that LINQ is a step above SQL statements, in that the power of the .Net framework is accessible quite easily in LINQ statements. This could have very easily been a call to a RegEx object or a method returning a boolean.