Genotype (format f4) 

This encoding scheme is an indirect developmental encoding, devised by Adam Rotaru-Varga and later upgraded by Maciej Komosinski (source code is available, see the last section).

Overview

The f4 encoding resembles the f1 encoding, with an important conceptual difference: f1 is composed of codes which are interpreted as structural elements (sticks, neurons), or their attributes. On the other hand, f4 codes are interpreted as instructions to cells. An f4 genotype describes the developmental process of an organism. Development starts with a single ancestor cell, which starts to execute instructions from the start of the genetic code. As the cell divides, new cells are created, which execute different instructions in parallel (differentiation). The development stops when all cells mature, and the final shape of the creature is the result of the whole development process.
Developmental encoding models biological growth of an individual. Some features of a developed creature are not encoded by a particular gene, but by an interplay between developing parts. These interactions are modeled during the development process.

The developmental process. A developing creature is composed of interconnected cells. A cell is either a stick, or a neuron, or still undifferentiated – this is the type of the cell. Undifferentiated cells can turn into a stick or neuron (as a result of developmental instructions), but not the other way around.
Development starts with a single, undifferentiated ancestor cell, executing the start of the genotype.
During each step, each cell executes one (or more) instructions, in parallel. A new step is started whenever a division or a change in type occurs. Development stops when all cells stop changing. At this point, there should be no undifferentiated cells.

The minimal example. The minimal f4 genotype looks like this: /*4*/X (ex0). This tells the ancestor cell to turn into a stick (X) and stop development. The end result is a single stick, corresponding to the f1 genotype "X".

Two-sticks example. /*4*/<X>X (ex1). This looks more interesting: < denotes cell division. It creates two cells: the first executes the instruction immediately following the <, while the other executes the instruction after the corresponding >. At first glance < and > seem to act like parentheses, but the resemblance is only superficial. In fact, > means "stop development".
The exact process looks like this:

    step 0. Initially, there is the ancestor cell 1, undifferentiated.
    step 1. cell 1 executes < (division), creates cell 2 undifferentiated.
    step 2. cell 1 executes X, turns into a stick.
    (cont) cell 2 executes X (the other one), turns into a stick.
    step 3. cell 1 executes > – stops development.
    (cont) cell 2 stops development.
The end result is two connected sticks, the same structure created by f1 genotype "XX".
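
The division rule used in this walkthrough can be sketched in a few lines of Python. This is only an illustration, not the Framsticks implementation: it handles single-character codes plus < and > (no neurons, connections, or repetition markers), it runs the cells one after another instead of in parallel, so the cell order may differ from the real simulator, and the names matching_stop and cell_programs are made up for this example. For each cell it lists the codes that the cell executes.

```python
def matching_stop(genotype, open_pos):
    """Return the index of the '>' that corresponds to the '<' at open_pos."""
    depth = 0
    for i in range(open_pos, len(genotype)):
        if genotype[i] == "<":
            depth += 1
        elif genotype[i] == ">":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("division at %d has no corresponding '>'" % open_pos)


def cell_programs(genotype):
    """List, for each cell, the codes it executes before it stops."""
    programs = []
    pending = [0]                 # start positions of cells waiting to run
    index = 0
    while index < len(pending):
        pos = pending[index]
        index += 1
        executed = []
        # A cell runs until it meets '>' or the end of the genotype.
        while pos < len(genotype) and genotype[pos] != ">":
            code = genotype[pos]
            if code == "<":
                # The new cell starts right after the corresponding '>'.
                pending.append(matching_stop(genotype, pos) + 1)
            executed.append(code)
            pos += 1
        programs.append("".join(executed))
    return programs


print(cell_programs("<X>X"))      # ['<X', 'X']
print(cell_programs("<<X>X>X"))   # ['<<X', 'X', 'X']
```

For /*4*/<X>X the first cell divides and then becomes a stick, and the second cell becomes a stick as well, which matches the steps listed above.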

Details

An f4 genotype is identified by the prefix /*4*/. This is not part of the genotype, but a comment identifying that it is of type f4.

Most f4 codes are one-letter codes, with some exceptions. An f4 genotype is expressed as a string of codes. However, because division branches the sequence of instructions in two, the genotype can be conceptualized as a binary tree of codes. The string representation corresponds to a pre-order traversal. Note that an f1 genotype can be conceptualized as a tree as well.
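
To make the tree view concrete, here is a minimal Python sketch that parses a genotype into a binary tree and writes it back in pre-order. It assumes single-character codes only (no neurons, connections, or repetition markers), and Node, parse, preorder and outline are hypothetical names chosen for this example, not SDK identifiers.

```python
class Node:
    """One code in the genotype tree; '<' has two children, '>' has none."""

    def __init__(self, code, children=()):
        self.code = code
        self.children = list(children)


def parse(genotype, pos=0):
    """Parse the code sequence starting at pos; return (subtree, next position)."""
    if pos >= len(genotype):               # a missing final '>' is implied
        return Node(">"), pos
    code = genotype[pos]
    if code == ">":                        # stop marker: a leaf
        return Node(">"), pos + 1
    if code == "<":                        # division: two branches
        first, pos = parse(genotype, pos + 1)
        second, pos = parse(genotype, pos)
        return Node("<", [first, second]), pos
    rest, pos = parse(genotype, pos + 1)   # any other code: a single child
    return Node(code, [rest]), pos


def preorder(node):
    """Serialize the tree back into a string by pre-order traversal."""
    return node.code + "".join(preorder(child) for child in node.children)


def outline(node, depth=0):
    """Return the tree as indented lines, one code per line."""
    lines = [" " * depth + node.code]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


tree, _ = parse("<<X>X>X")
print(preorder(tree))              # <<X>X>X>
print("\n".join(outline(tree)))
```

The round trip returns the input string with a trailing > appended, because the sketch always writes out the stop marker that ends the last code sequence.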

Codes are the following:

  • < Division. Creates a new cell. The new cell will be connected to the old one. This is the only way of creating a new cell.

    After division, the two cells will execute different codes. The < is followed by the codes executed by the first cell (ending in a >), and then the codes to be executed by the second cell. Thus the code to be executed by the second cell can be found after the corresponding >. Note that both code sequences can contain further divisions. The general form is:

    < ...cell 1 code... > ...cell 2 code... >

    If there are n divisions in a genotype, they will create n+1 cells, so the genotype needs n+1 cell-stop markers >. However, the very last > can be omitted because nothing follows it; omitting it also leaves an equal number of < and > codes.

    Examples: ex3, ex4, ex5, ex6, ex7, ex8b.

    Another example: ex53. In this case, after the first division, the first cell continues with the 'L' (pos 2), while the second, freshly created cell continues with the 'l' (pos 9).

    Usually undifferentiated cells divide, and later differentiate into sticks or neurons. However, a neuron can divide, and in this case the new cell will be a neuron as well, with the same characteristics as the old cell (ex13). Existing neuron connections are also duplicated (ex13a, ex13b). Sticks cannot divide.

  • X Turn into stick. Turns the cell into a stick. The cell must be undifferentiated. It will remain a stick, since a stick cannot change its type (nor divide).
  • N Turn into neuron, a receptor or a muscle. Turns the cell into a neuron. The cell must be undifferentiated. It will remain a neuron, since a neuron cannot change its type.

    The full syntax is N:neuron_class_name. For example, N:N creates a standard "N" neuron. N:| turns a cell into a bending muscle (examples: ex10, ex12, ex51). N:@ turns a cell into a rotating muscle (examples: ex11, ex14, ex52).

  • > Stop development of a cell. The cell should not be undifferentiated.
  • , Increase branching angle (comma). Increases the branching count (angle) of future divisions. The change takes effect when the divided daughter cell turns into a stick. The cell can be undifferentiated or a stick. Example: ex3.
  • Modifiers:
    • 'L'/'l' Increase/decrease length of stick. Works with sticks and undifferentiated cells.
    • 'R'/'r' Increase/decrease rotation by 45 degrees. Works with sticks and undifferentiated cells.
    • 'C'/'c' Increase/decrease curvedness.
    • 'Q'/'q' Increase/decrease twist.
    • 'F'/'f' Increase/decrease Part friction.
    • 'M'/'m' Increase/decrease muscle strength.

    Less useful properties:

    • 'W'/'w' Increase/decrease stick weight (in water only).
    • 'I'/'i' Increase/decrease ingestion.
    • 'A'/'a' Increase/decrease assimilation.
    • 'S'/'s' Increase/decrease stick stamina.
    • 'E'/'e' Increase/decrease stick energy.

    To learn how the values of these properties depend on the sequence of the codes, read about modifiers in the f1 genotype format.

  • Neuron properties can be modified by a sequence of increasing/decreasing codes; currently this is supported only for the "N" neuron. (Note that in f1 these are set using concrete numerical values.) These codes are of the form ':+X:' or ':-X:' for increasing and decreasing, respectively.
      ':+!:' Increase neural force, by (1.0 – force) * 0.2.
      ':-!:' Decrease neural force, by force * 0.2.
      ':+=:' Increase neural inertia, by (1.0 – inertia) * 0.2.
      ':-=:' Decrease neural inertia, by inertia * 0.2.
      ':+/:' Increase neural sigmoid value, multiply by 1.4.
      ':-/:' Decrease neural sigmoid value, divide by 1.4.
  • [...:...] Add a neural connection. Adds a neural connection to a neuron (cell must be neuron). The format is: [ input_connection : weight ].
    Input connection is an integer number, interpreted as the relative reference to the neuron where the connection is originating. 1 means the "next" neuron, –1 the previous, 0 this one, 3 three from here on, and so on. Note that relative references are based on the current developing structure, and not the final one!
    The weight is a real number representing the weight of the connection. Examples: ex9, ex11.
  • # Repetition marker. This code allows certain other codes to be repeated more than once. The explanation of this code requires certain general details, left out from the discussion so far for the sake of clarity. This code is quite tricky, but it is also powerful: it can create repetitions of the same codes (and thus substructures) without the duplication of the codes themselves.

    Each cell has a pointer to the currently executing code. It also has a second pointer, the "to repeat" pointer, and an associated "to repeat" counter. The # code creates a branching in the genotype tree, much like the < (it does not create new cells, though). It is also associated with a number, as in #3. The action is this: set the repeat counter of the cell to the number associated with #, make the cell execute the first child, and store the # in the repeat pointer. For repetition to actually work, the semantics of > must be modified as well: if the repeat counter is 0, > means a regular "stop", as described above. However, if the counter is not 0, it is decremented, and if it is still not zero, the cell "jumps" back to the repeat pointer. If the repeat counter reaches 0, the second child of the # is executed to finish off.

    What all this means is that a genotype of the form

    #n ...repcode... > ...endcode... >

    means that the repcode part will be repeated n times, and the endcode once in the end.

    Since endcode also requires > as the final delimiter to avoid ambiguity, a genotype that uses # repetition markers requires more > than < codes.

    There's one more detail: when a cell divides (<), only the new cell inherits the repeat counters, and not the old one. Thus only one cell will continue the repetition.

    Examples: ex8a, ex15, ex16, ex17, ex54, ex55, ex57, ex58.
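
The repetition mechanism described for the # code can be tried out with the following Python sketch. It is a simplified model under stated assumptions, not the SDK code: only single-character codes, <, > and #n are handled, a cell tracks one repetition at a time (nested # codes as in ex55 are not modelled), cells run sequentially rather than in parallel, and the names subtree_end and develop are invented for this example. The function returns, for every cell, the non-structural codes it executes.

```python
def subtree_end(genotype, start):
    """Index of the '>' that closes the code sequence starting at start."""
    depth = 0
    for i in range(start, len(genotype)):
        if genotype[i] in "<#":        # each '<' or '#' needs one extra '>'
            depth += 1
        elif genotype[i] == ">":
            if depth == 0:
                return i
            depth -= 1
    return len(genotype)               # the very last '>' was omitted


def develop(genotype):
    """Return, for each cell, the codes it executes (without '<', '>', '#n')."""
    results = []
    # A waiting cell: (position, repeat counter, repeat pointer, endcode position)
    pending = [(0, 0, None, None)]
    while pending:
        pos, counter, repeat_ptr, end_ptr = pending.pop(0)
        executed = []
        while True:
            code = genotype[pos] if pos < len(genotype) else ">"
            if code == ">":
                if counter == 0:
                    break                      # regular stop
                counter -= 1
                pos = repeat_ptr if counter > 0 else end_ptr
            elif code == "<":
                second = subtree_end(genotype, pos + 1) + 1
                # Only the new cell inherits the repeat counter.
                pending.append((second, counter, repeat_ptr, end_ptr))
                counter = 0
                pos += 1
            elif code == "#":
                digits_end = pos + 1
                while digits_end < len(genotype) and genotype[digits_end].isdigit():
                    digits_end += 1
                counter = int(genotype[pos + 1:digits_end])
                repeat_ptr = digits_end        # first child of '#': repcode
                end_ptr = subtree_end(genotype, repeat_ptr) + 1   # endcode
                pos = repeat_ptr
            else:
                executed.append(code)
                pos += 1
        results.append("".join(executed))
    return results


print(develop("#3<X>lC>X"))   # ['X', 'lCX', 'lCX', 'lCX']
print(develop("#3l>X"))       # ['lllX']
```

For ex8a the sketch produces four cells, one plain stick followed by three sticks that each execute l and C first, which agrees with the f1 counterpart XlCXlCXlCX given in the examples table.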

Genetic operators

Mutation produces some localised changes in an f4 genotype. Mutation operates on the tree representation of a genotype. A single mutation can change a code, change a parameter of a code, add a new code (division, neural connection, repetition marker, or simple code), or delete a code. A mutation operator can consist of several single mutations. The amount of change caused by a mutation is estimated as the ratio of the number of changed codes to the total number of codes.

Crossover operates on f4 genotype trees, and exchanges two subtrees of two genotypes. The size of the exchanged subtrees varies between 10 and 90 percent of the genotype.
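
A subtree exchange of this kind could look roughly like the Python sketch below. Several things here are assumptions rather than facts about the actual operator: subtree size is measured in tree nodes, both subtrees are chosen uniformly at random among those within the 10 to 90 percent range, only single-character codes are parsed, and the offspring are not checked for validity (for example, a cell could end up with two X codes). All names (Node, parse, preorder, size, slots, crossover) are hypothetical.

```python
import random


class Node:
    def __init__(self, code, children=()):
        self.code = code
        self.children = list(children)


def parse(genotype, pos=0):
    """Parse the code sequence starting at pos; return (subtree, next position)."""
    if pos >= len(genotype):
        return Node(">"), pos              # a missing final '>' is implied
    code = genotype[pos]
    if code == ">":
        return Node(">"), pos + 1
    if code == "<":
        first, pos = parse(genotype, pos + 1)
        second, pos = parse(genotype, pos)
        return Node("<", [first, second]), pos
    rest, pos = parse(genotype, pos + 1)
    return Node(code, [rest]), pos


def preorder(node):
    return node.code + "".join(preorder(child) for child in node.children)


def size(node):
    return 1 + sum(size(child) for child in node.children)


def slots(node):
    """All (parent, child index) pairs, i.e. every place where a subtree hangs."""
    found = []
    for index, child in enumerate(node.children):
        found.append((node, index))
        found.extend(slots(child))
    return found


def crossover(genotype_a, genotype_b, rng):
    """Swap one randomly chosen subtree between the two genotype trees."""
    tree_a, _ = parse(genotype_a)
    tree_b, _ = parse(genotype_b)

    def candidates(tree):
        total = size(tree)
        return [(parent, index) for parent, index in slots(tree)
                if 0.1 * total <= size(parent.children[index]) <= 0.9 * total]

    options_a, options_b = candidates(tree_a), candidates(tree_b)
    if not options_a or not options_b:
        return preorder(tree_a), preorder(tree_b)   # nothing suitable to swap
    parent_a, index_a = rng.choice(options_a)
    parent_b, index_b = rng.choice(options_b)
    parent_a.children[index_a], parent_b.children[index_b] = (
        parent_b.children[index_b], parent_a.children[index_a])
    return preorder(tree_a), preorder(tree_b)


child_a, child_b = crossover("<X>lX", "<<X>X>LLX", random.Random(1))
print(child_a, child_b)
```

Because whole subtrees are exchanged, both offspring are again well-formed trees, and together they contain exactly the codes of the two parents.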

Examples

Columns: example no., f4 genotype, corresponding f1 genotype, description. The f1 genotypes are approximate: for example, f1 does not allow continuing after branching, as in X(X,X)X, and some modifier genes are not perfectly converted.
ex0 X X a single stick
ex1 <X>X XX two sticks connected one-after-the-other
ex2 <<X>X>X X(X,X) two sticks connected to the same third one (branching)
ex3 <,<,,<,X,,>X>X>X X(,X,,,X,,X,,) a 3-way branching with different branching angles
ex4 <X><X><<X>X><X>X XXX(XX,X) more branching
ex5 <<X><<X>X>X>X X(X,X(X,X)) more complex branching
ex6 <X>l<X>l<<X>X>LLLX XlXlX(LLLX,X) different length factors
ex7 <<X>RR<<X>X>X>X X(X,RRX(X,X)) branching with rotation R
ex8a #3<X>lC>X XlCXlCXlCX repetition, C
ex8b <X>lC<X>lC<X>lCX XlCXlCXlCX curvedness C
ex9 <X><N:N[1:2]>N:N[-1:3] X[N,1:2][N,-1:3] a stick with two neurons attached to it. The two neurons are connected to each other
ex10 <X><X><N:|[1:2]>N:N XX[|,1:2][N] a structure with a neuron and a bending muscle
ex11 <X><<,<X,><N:@[1:30]>N:G>X>X XX[@,1:30][G](X,,X,) a structure with a rotating muscle and a gyroscope (tilt) sensor
ex12 <<X>N:|[-1:2]>N:G X[G][|,-1:2] a stick with a bending muscle and a gyroscope (tilt) sensor
ex13 <X>N:Sin<><>> X[Sin][Sin][Sin] three "sine" neurons
ex13a <X><N:*>N:N[-1:10]<><>> X[*][N,-1:10][N,-2:10][N,-3:10] connections duplicated upon neuron division
ex13b <X><<<N:G>N:N[-1:0.1][-2:0.2][-3:0.3]<><><>>N:T>N:S> X[G][S][T][N, -1:0.1, -2:0.2, -3:0.3][N, -2:0.1, -3:0.2, -4:0.3][N, -3:0.1, -4:0.2, -5:0.3][N, -4:0.1, -5:0.2, -6:0.3] a fully-connected two-layer neural network; three sensors (G, T, S) in the first layer and four N neurons in the second layer, 3×4 connections
ex14 <X><<X>N:|[2:1.2]><X><N:@[1:2.3]>N:G XX[|, 2:1.2]X[@, 1:2.3][G] various neurons and connections
ex15 #10,<<qX>X>>LLX X(,,X(,,X(,,X(,,X(,, X(,,X(,,X(,,X(,,X(,LLX, X)X)X)X)X)X)X)X)X)X) (f1 not updated with q) repeating division 10 times
ex16 #10,<<X><<X>X>X>>LLX X(,X(,X(,X(,X(,X(,X(, X(,X(,X(,LLX,X(X,X)), X(X,X)),X(X,X)),X(X,X)), X(X,X)),X(X,X)),X(X,X)), X(X,X)),X(X,X)),X(X,X)) repetitions with even more divisions
ex17 rr<X>#9<,<X>RR<<llX>LX>LX>>X rrXX(,X(,X(,X(,X(,X(,X(,X(,X(,X, llRRX(LLX,LLX)), llRRX(LLX,LLX)), llRRX(LLX,LLX)), llRRX(LLX,LLX)), llRRX(LLX,LLX)), llRRX(LLX,LLX)), llRRX(LLX,LLX)), llRRX(LLX,LLX)), llRRX(LLX,LLX)) a worm-like creature, composed of repeated identical segments ("millipede")
ex51 <X><X>N:| XX[|] two sticks and a bending muscle between them
ex52 <X><X>N:@ XX[@] two sticks and a rotating muscle between them
ex53 <LL<X>X>llX LLX(lllLX,X) 3 sticks in a "Y" formation
ex54 <<X>X>#6<ccX>>X X(cXcXcXcXcXcXcX,X) a curved tail made of repeated segments
ex55 <<X>X>#4<<LLLLX>#2LLL<<<<X>X>X>X>>X>>X X(,LLLLX(LLLX(LLLX(LLLX(llLX, LLX(,,,LLLX(X,X,X,X)XXX)), LLX(,,,LLLX(X,X,X,X)XXX)), LLX(,,,LLLX(X,X,X,X)XXX)), LLX(,,,LLLX(X,X,X,X)XXX))X) two-level tree made of repeated segments
ex57 <X>N:N[0:1]#5<>>> X[N, 0:1][N, -1:1][N, -2:1][N, -3:1][N, -4:1][N, -5:1] one neuron as an input to a layer of five other neurons
ex58 <X>N:N#5<[1:1]>>> X[N, 1:1][N, 1:1][N, 1:1][N, 1:1][N, 1:1][N] one neuron as an input to a sequence of five other neurons
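
The delimiter bookkeeping described under Details (n divisions need n+1 stop markers, each repetition marker needs one more, and the very last > may be omitted) can be checked against genotypes from this table with a short Python sketch. The function name stop_markers_consistent is hypothetical, and the check is a necessary condition only: it counts characters and assumes that <, > and # never occur inside neuron class names or other multi-character codes.

```python
def stop_markers_consistent(genotype):
    """Compare the number of '>' codes with the number of '<' and '#' codes."""
    required = genotype.count("<") + genotype.count("#") + 1
    # The very last '>' may be omitted, so one fewer is accepted as well.
    return genotype.count(">") in (required - 1, required)


examples = {
    "ex1": "<X>X",
    "ex2": "<<X>X>X",
    "ex3": "<,<,,<,X,,>X>X>X",
    "ex8a": "#3<X>lC>X",
    "ex13": "<X>N:Sin<><>>",
    "ex55": "<<X>X>#4<<LLLLX>#2LLL<<<<X>X>X>X>>X>>X",
    "ex57": "<X>N:N[0:1]#5<>>>",
    "ex58": "<X>N:N#5<[1:1]>>>",
}

for name, genotype in examples.items():
    print(name, stop_markers_consistent(genotype))
```

All listed examples pass the check; a string such as <X>>>X, with more stop markers than the rule allows, does not.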

Technical details

The sources of this encoding are part of the SDK. Some "TODO" issues remain, in case you want to help and contribute.

A sample development of cells from the /*4*/<X>N:N#5<[1:1]>>> genotype (ex58) to the corresponding phenotype, step by step (generated by the source code mentioned above), looks like this:

Tree after parsing the genotype string:

< (9)
 X (1)
  > (0)
 N:N (6)
  #5 (5)
   < (3)
    [ (1)     from=1    weight=1
     > (0)
    > (0)
   > (0)



Development of cells:

------ Initialization                                          ------ errorcode=0, errorpos=-1
 0(no progress)  nr=0    type=undiff             genot=<         gcurrent=<

------ Development step                                        ------ errorcode=0, errorpos=-1
 0(progress)  nr=0       type=undiff             genot=<         gcurrent=X
 1(no progress)  nr=1    type=undiff             genot=<         gcurrent=N

------ Development step                                        ------ errorcode=0, errorpos=-1
 0(progress)  nr=0       type=STICK              genot=<         gcurrent=>
 1(progress)  nr=1       type=NEURON:N           genot=<         gcurrent=#

------ Development step                                        ------ errorcode=0, errorpos=-1
 0(finished)  nr=0       type=STICK              genot=<         gcurrent=null
 1(progress)  nr=1       type=NEURON:N           genot=<         gcurrent=[     from=1  weight=1
 2(no progress)  nr=2    type=NEURON:N           genot=<         gcurrent=>

------ Development step                                        ------ errorcode=0, errorpos=-1
 0(        )  nr=0       type=STICK              genot=<         gcurrent=null
 1(finished)  nr=1       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=2 weight=1
 2(progress)  nr=2       type=NEURON:N           genot=<         gcurrent=[     from=1  weight=1
 3(no progress)  nr=3    type=NEURON:N           genot=<         gcurrent=>

------ Development step                                        ------ errorcode=0, errorpos=-1
 0(        )  nr=0       type=STICK              genot=<         gcurrent=null
 1(        )  nr=1       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=2 weight=1
 2(finished)  nr=2       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=3 weight=1
 3(progress)  nr=3       type=NEURON:N           genot=<         gcurrent=[     from=1  weight=1
 4(no progress)  nr=4    type=NEURON:N           genot=<         gcurrent=>

------ Development step                                        ------ errorcode=0, errorpos=-1
 0(        )  nr=0       type=STICK              genot=<         gcurrent=null
 1(        )  nr=1       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=2 weight=1
 2(        )  nr=2       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=3 weight=1
 3(finished)  nr=3       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=4 weight=1
 4(progress)  nr=4       type=NEURON:N           genot=<         gcurrent=[     from=1  weight=1
 5(no progress)  nr=5    type=NEURON:N           genot=<         gcurrent=>

------ Development step                                        ------ errorcode=0, errorpos=-1
 0(        )  nr=0       type=STICK              genot=<         gcurrent=null
 1(        )  nr=1       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=2 weight=1
 2(        )  nr=2       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=3 weight=1
 3(        )  nr=3       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=4 weight=1
 4(finished)  nr=4       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=5 weight=1
 5(progress)  nr=5       type=NEURON:N           genot=<         gcurrent=[     from=1  weight=1
 6(no progress)  nr=6    type=NEURON:N           genot=<         gcurrent=>

------ Development step                                        ------ errorcode=0, errorpos=-1
 0(        )  nr=0       type=STICK              genot=<         gcurrent=null
 1(        )  nr=1       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=2 weight=1
 2(        )  nr=2       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=3 weight=1
 3(        )  nr=3       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=4 weight=1
 4(        )  nr=4       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=5 weight=1
 5(finished)  nr=5       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=6 weight=1
 6(finished)  nr=6       type=NEURON:N           genot=<         gcurrent=null

------ Development step                                        ------ errorcode=0, errorpos=-1
 0(        )  nr=0       type=STICK              genot=<         gcurrent=null
 1(        )  nr=1       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=2 weight=1
 2(        )  nr=2       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=3 weight=1
 3(        )  nr=3       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=4 weight=1
 4(        )  nr=4       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=5 weight=1
 5(        )  nr=5       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=6 weight=1
 6(        )  nr=6       type=NEURON:N           genot=<         gcurrent=null

------ After last development step                             ------ errorcode=0, errorpos=-1
 0(        )  nr=0       type=STICK              genot=<         gcurrent=null
 1(        )  nr=1       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=2 weight=1
 2(        )  nr=2       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=3 weight=1
 3(        )  nr=3       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=4 weight=1
 4(        )  nr=4       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=5 weight=1
 5(        )  nr=5       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=6 weight=1
 6(        )  nr=6       type=NEURON:N           genot=<         gcurrent=null

------ Final                                                   ------ errorcode=0, errorpos=-1
 0(        )  nr=0       type=STICK              genot=<         gcurrent=null
 1(        )  nr=1       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=2 weight=1
 2(        )  nr=2       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=3 weight=1
 3(        )  nr=3       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=4 weight=1
 4(        )  nr=4       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=5 weight=1
 5(        )  nr=5       type=NEURON:N           genot=<         gcurrent=null
        conn:0 from=6 weight=1
 6(        )  nr=6       type=NEURON:N           genot=<         gcurrent=null


...and this set of 7 differentiated cells is then converted to an f0 genotype.