Project idea – sensor integration

We discussed that it is crucial for cognitive systems to integrate sensor information. We have to keep in mind that there are many different possible ways to integrate information. We usually think of situations where I integrate vision (“big black thing from the right”) with hearing (“whistles like a train”) into a perception of an “immediately dangerous train running me over in a second”, but sensor integration can also happen along the temporal axis. What I heard a couple of seconds ago might put everything into a different context, like the sentence “I am going to show you a funny video”.

However, I suggest studying a system that has to integrate “visual” information with “olfaction”. I use the quotation marks deliberately to point out that vision and olfaction in natural systems are very different from the abstractions used here.

Imagine a T-maze that doesn’t have walls but is more like a platform floating in space; when our agent steps off the maze, it is dead. The agent has three forward-facing “eyes” that tell it whether solid ground or the abyss lies ahead, which allows it to at least navigate around without falling off. The agent is also equipped with a “nose” that can sense the concentration of a pheromone or substance secreted by the target located at the end of one of the T-maze arms. A Dijkstra shortest path is computed along the route, which gives the concentration of this substance at any given location. The nose senses this stochastically: it is in an on state with a likelihood of 1/distance and off otherwise. If the agent waits at a location, it can integrate this firing frequency over time and figure out whether one direction is better than another.
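To make the trace and the stochastic nose concrete, here is a minimal sketch of how one could compute them on a grid (the maze layout, the use of BFS for the shortest path on a unit-cost grid, and treating distance 0 as always-on are my assumptions, not part of the proposal):

```python
import random
from collections import deque

def shortest_path_distances(cells, target):
    """BFS from the target over the walkable cells, giving the
    shortest-path distance (Dijkstra on a unit-cost grid) to every cell."""
    dist = {target: 0}
    queue = deque([target])
    while queue:
        x, y = queue.popleft()
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt in cells and nxt not in dist:
                dist[nxt] = dist[(x, y)] + 1
                queue.append(nxt)
    return dist

def nose_reading(distance):
    """Stochastic nose: on (1) with probability 1/distance, off (0) otherwise."""
    if distance == 0:
        return 1  # sitting on the target: always on (assumption)
    return 1 if random.random() < 1.0 / distance else 0

def estimated_concentration(distance, samples=100):
    """Waiting at a spot and averaging readings estimates 1/distance."""
    return sum(nose_reading(distance) for _ in range(samples)) / samples

# A tiny T-maze: vertical stem plus a horizontal top bar.
stem = {(2, y) for y in range(5)}
bar = {(x, 4) for x in range(5)}
maze = stem | bar
dist = shortest_path_distances(maze, target=(0, 4))  # target at the left arm end
```

Averaging `nose_reading` over time is exactly the waiting strategy described above: the longer the agent sits still, the better its estimate of 1/distance, and comparing estimates at two locations reveals the gradient.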

The following picture is supposed to illustrate the experiment a little better. The triangle is our agent, and T marks the target (whatever it is). The numbers in the T-maze show the concentration of the trace. While it looks hard, it is super easy to calculate, and I can imagine making the gradient steeper or more shallow; that remains to be seen. The two figures on the right show the firing frequency of the olfaction sensor in pink, together with what the eyes see. The one on the left, which has all three eye sensors white, is probably sitting right beneath the T, while the rightmost one sits somewhere else on the path and sees path in front of it but abyss to the left and right. We can also think about a more elaborate nose, of course.

[Figure: sensorIntegrationOverview – the T-maze with trace concentrations, the agent (triangle), the target (T), and two example sensor readouts]
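The three-eye sensor described above could be sketched like this (a minimal illustration assuming a north-facing agent on a grid; the cell layout is my assumption):

```python
def eye_readings(pos, maze):
    """Three forward-facing eyes for an agent facing north:
    ahead-left, ahead, ahead-right. Each reports 1 for solid
    ground and 0 for the abyss."""
    x, y = pos
    looked_at = [(x - 1, y + 1), (x, y + 1), (x + 1, y + 1)]
    return [1 if cell in maze else 0 for cell in looked_at]

# The same tiny T-maze: vertical stem plus a horizontal top bar.
maze = {(2, y) for y in range(5)} | {(x, 4) for x in range(5)}
```

Right beneath the top bar the agent sees `[1, 1, 1]` (all three eyes white, as in the left readout of the figure); on the stem it sees `[0, 1, 0]` (path ahead, abyss to both sides, as in the rightmost readout).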

The agent can turn left or right and walk one step forward. Many different fitness functions are possible: rewarding speed, rewarding accuracy, rewarding efficiency (not taking detours), …

I think one can test a couple of things in this system, foremost the organization of the evolved brains and how much each Markov Gate is involved in which cognitive task. Putting the agents later into more complex environments, or having two targets, tells us something about robustness, preadaptation, and so forth. Doing the same task with an ANN might also be interesting.

3 Comments

  1. Nathan Ward March 13, 2014 12:39 am

    Here is my comment:

    This project seems the most interesting to me. We talked about the integration of our senses a little bit in class, and this seems to address some of the topics we mentioned, i.e., “how do our senses aggregate to form what we perceive?”

    In ANNs this question isn’t really considered in any way. Input is input, and it doesn’t really matter which features constitute the input layer. For example, in speech recognition, spectral features can be used alongside features extracted from a language model to predict words.

  2. Jory Schossau March 13, 2014 1:37 pm

    Having more solutions than one path toward A or B would allow for other behavior, and I’m curious if they optimize as an algorithm, or wander to a point where they can do something dead simple.

  3. Hintze March 13, 2014 1:41 pm

    As an alternative, you could feed redundant data with different encodings into an ANN or MB and see what kind of encoding is preferred. As an example: if you want to feed a 3 into a network, you can either pull three nodes to 1, or fire one node three times in sequence. After evolving a meaningful task based on these inputs, you ask which of the two types is preferred: the sequential or the spatial one?
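The two encodings described in the comment above could be sketched like this (a minimal illustration; the node count of three and the three-step time window are my assumptions):

```python
def spatial_encoding(value, n_nodes=3, n_steps=3):
    """Spatial: pull `value` nodes to 1 simultaneously, held every step."""
    frame = [1 if i < value else 0 for i in range(n_nodes)]
    return [frame[:] for _ in range(n_steps)]

def sequential_encoding(value, n_nodes=3, n_steps=3):
    """Sequential: fire a single node once per step, `value` times in a row."""
    return [[1 if t < value else 0] + [0] * (n_nodes - 1)
            for t in range(n_steps)]
```

Both functions return one frame of input-node states per time step, so either can be wired to the same three input nodes of an evolved network.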
