Ingeneue Tutorial #3

Writing an Ingeneue Affector

One of Ingeneue's primary strengths is the way in which it handles building the equations for a network model. Ingeneue contains separate mathematical terms representing different genetic processes. To get the equation for the change over time in the concentration of a network element, you can mix and match terms representing different processes affecting that element. The nice thing about this scheme is that there are a fairly limited number of potential processes affecting genetic elements. mRNA's are translated into proteins. Proteins can bind to one another or modify each other through actions like phosphorylation. Proteins can also activate or inhibit transcription. And there are quite a few more, of course, but the list of common interactions is not that large. Currently Ingeneue has about 50 different mathematical terms, and these 50 have been enough to model 2 fairly complex networks. We guesstimate that perhaps 100 terms will be enough to model all but the most obscure processes.

As an example of how these terms work, lets say a protein X is getting produced from its mRNA, degrades over time, and heterodimerizes with another protein Y. The full differential equation for X might look like this:

where Tmax is the maximum rate of translation, RNA(x) is the concentration of X's mRNA, wXY is the rate of dimerization between X and Y, and Hx is the half-life of X. This equation can easily be split as follows:







In Ingeneue, these terms are given the name Affectors - they encode how various processes "affect" the concentration of some network element. For each of the processes given above, an Affector already exists in Ingeneue. You can look up the appropriate Affectors in the html documentation files. To make X's full equation you would put the following lines into an Ingeneue network file:

       &TlnAff      X     H_X
       &HeterodimerizeA_ICAff  X    Y    W_XY    max_X
       &DecayAff    X     H_X
<rest of network...>

In programming-speak, Ingeneue has each of these terms as a separate Affector class. It then instantiates a copy of the appropriate class for each Affector given in the input file. What this means in English is that each term has a small file of code that calculates the value of the term. Ingeneue will make a copy of this code for each use of that term. Even if you are not much of a programmer, you can learn how to write the code for one of these Affectors. This can be useful if your network includes a process not yet covered by any of the built-in Affectors. Its also useful just so you can read the formulas in the current set of Affectors, in case you want to be absolutely sure of what a particular Affector is doing.

In order to make your own Affector, you'll need to have a way of compiling Java code. You can do this either manually from the command line (not recommended) or use a Java development environment such as Eclipse or Omnicore's Codeguide. You will need to read the documentation that comes with your development tools for how to set it up to compile Ingeneue. Be sure that you compile with -version 1.4.

Once you have a programming system up and going, find the file called and open it. I'll take you through this file briefly just to point out the things you'll have to write for your own Affector.

There are three important sections to an Affector's code. If you look near the top of the file, you'll see a long comment that describes what the Affector does, its parameters, and how to use it. Just below the comment, the first section of code is essentially more documentation on the function. There is a short description of the Affector, names for each of the Nodes used by the Affector, and names for each parameter used by the Affector.

The middle part of the file, starting with

    public DecayAff() {

and ending with the method

    public setParameterNumbers()

is involved in setting up the Affector and sending information about the Affector to the rest of Ingeneue. This is all boilerplate code that is uninteresting but easy to put in.

The last part is the heart of the Affector. The method getValue() returns the current value of the Affector, that is, the rate at which the concentration of this Affectors' Node should be changing based on the term encoded by this Affector. The getNCValue() method is almost identical to getValue() - the difference is described below.

Go ahead and close the Decay Affector file. To begin, let's write the simplest transcriptional Affector, one which just transcribes based on the concentration of some activator. This is a nice Affector to start with because it is relatively short and is completely contained inside the cytoplasm (no cell-cell interactions). The second example in this tutorial will show you how to use membrane-associated elements.

We decided activators in genetic networks could be characterized by three properties: (1) They can activate up to some maximum level of transcription set by the enhancer region and ultimately set by the rate at which the transcriptional machinery can move along the DNA. (2) There is a concentration of activator at which the gene is being transcribed at half its maximal rate. (3) There may be some non-linearity in the rate of transcription as the activator concentration increases.

One of the simplest functions that will satisfy the above is the Hill equation:


where Tmax is the maximum rate at which this gene can be transcribed, A is the concentration of activator, K(A) is the activator concentration at which it half-maximally activates transcription, and Va is the degree of non-linearity in the activation curve (we also refer to this as cooperativity as this non-linearity will often come from cooperative interactions). This function is what you will implement in your new Affector.

The formula you'll actually use, though, is a bit different than the one above because we do something called "non-dimensionalizing" our equations. Each of the variables and parameters in the equation above has certain dimensions. The left-hand side has dimensions of [Gene] / time, where [ ] indicates concentration. A has units of concentration of activator ([A]), as does K. Tmax has units of [Gene]/([A]*time). Only V is a straight value with no units. Non-dimensionalizing, as its name implies, means removing all the dimensions from the equations to come up with a new set of equations where all the variables and parameters are unit-free. The main reason to do this is that it often greatly reduces the total number of parameters in the model - in the case of Ingeneue models, non-dimensionalizing gets rid of somewhere around one quarter to one third of the parameters. Since more parameters make models harder to analyze, this is a big win. This seeming magic occurs because most equations have parameters that are redundant with each other as far as the qualitative behavior of the system goes. If you combine these parameters together into ratios, you will get the same qualitative behavior with fewer parameters.

The way you non-dimensionalize is to multiply and divide equations through by certain of their parameters or variables. Figuring out a good non-dimensionalization for a set of equations is simple in theory, but something of an artform in practice. This tutorial isn't going to teach you how to do it, nor explain exactly why we picked one particular non-dimensionalization scheme for our network equations. For the former, consult any good mathematical biology text. For the latter, you can look in the supplement to our Nature 2000 article, available at the same website as the Ingeneue software. Even without understanding our non-dimensionalization, though, you can get your equations right by following these three rules:

a) All Affectors must be multiplied by a variable called Globals.characteristicTime. That means whatever the final value your Affector is going to return, before returning it you should multiply by the characteristic time.

b) All Affectors which are either producing or destroying things must be divided by the half-life of the Node they are affecting. This means transcriptional terms, translational terms, and degradation terms are all divided by the half-life of the element being transcribed, translated, or degraded.

c) The Tmax term, as well as the corresponding term in translation (maximum translation rate) disappear. These were the terms that our non-dimensionalization got rid of.

[As an aside, note that these three rules do not apply to Affectors which sit inside of the Sum and Product meta-affectors. To see the reason you will need to read through our supplement and then work through the non-dimensionalization for yourself].

Applying those rules to the transcription equation above, we get:


Where to is the characteristic time, Hnode is the half-life of the target node (the mRNA being transcribed), and A and K are now scaled, but can essentially be thought of the same as they were in the first equation. The Tmax term has disappeared. This new equation is what we'll use in building a first Affector.

The rest of this tutorial is set up as two programming exercises which lead to building two Affectors. There is not room in the tutorial to explain how to program. While you don't need to be an expert programmer in order to write Affectors, you will need to know a few rudimentary things. You'll need to have a programming environment from within which you can write and compile Java code. You should also know about a few very basic programming concepts: declaring and using variables, what an array is and how to use it, what a method is (called functions in other programming languages) and how to use one. There are many introductory Java books that will introduce you to all these concepts.

1. Start up your programming environment or a text editor.

2. Load the file

3. Scan through the comments in the file to get an idea of what we're going to do.
The easiest place to start is to pick a name for your new Affector. You can name it anything you want, but let me suggest SimpleTranscription (substitute your own name below if you wish).

4. Search for the two instances of TemplateAff that are not in comments, and replace them with SimpleTranscriptionAff. One instance is near the top of the code where the class is declared. The other instance is the constructor method for the class, about half-way down the file.
Just for future reference, the name of every Affector needs to end with Aff.

My usual method for building Affectors is to begin by coding the equation itself. Then I fill in the other parts based on what I needed for that equation. The equation goes into a method called getValue() which is located near the bottom of the file. In the template it contains a simple example formula. In the next few steps you'll replace this example formula with the formula given above.

The first thing to do is get the current concentration of the activator. As you'll recall from previous tutorials, the elements of a network are called Nodes in Ingeneue, and Affectors have access to the concentrations of any Nodes they use through an array called (surprisingly) Nodes[]. The Nodes used by the Affector will be named in the input file, and are then stored in the Nodes array in exactly the same order as they are named. This particular Affector only needs a single Node, that of the activator. So the activator Node can be referenced as Nodes[0].

5. Find the getValue() method and delete everything currently inside of the curly brackets {}. (Make sure you have found getValue() and not getNCValue()). Make a new line to begin typing your new formula.

6. Start by making a variable called activator that stores the current concentration of the activator. To get the current concentration of activator, you can call the Node method getIntegrationValue(side). So the code will look like this:

            float activator = Nodes[0].getIntegrationValue(side);

Next you should calculate how close to maximally active the transcription of the gene is. This is the part of the equation inside the parenthesis. The quantity will range from 0 when there is no activator present to 1 when the activator concentration is extremely high relative to its half-maximal activation concentration. You will need the values of two of the parameters to do this part of the equation. As with the Nodes, the parameters used by an Affector are also stored in an array, in this case called params[]. Unlike the Nodes, this array of parameters is shared between all the Affectors in a model, however, since many Affectors will be using each parameter. So rather than knowing exactly which position in the array corresponds to each parameter, you will instead have a variable (which you'll write the code to set in a few steps) that stores the array position for each parameter. You should give these variables good names. For instance, you might use the following for the three parameters:


7. Make a new float variable called proportion_active. Set proportion_active to the proportion of its maximum rate that the gene is being transcribed (the part of the formula inside the parenthesis). This means you will need to write an equation on the right hand side of the following code:

            float proportion_active = 

In this formula you will need to raise some numbers to a power. You can do this with the method:

            (float)Math.pow(num, exponent)

[Answers to all exercises are at the bottom of the tutorial].

8. Make another variable called value. Set value equal to the full formula for transcription. Remember that the characteristic time is given by Globals.characteristicTime.

9. Return value from the getValue() method by adding the following line at the end:

             return value;

You now have the most important part of the Affector written. The rest is just cookbook code to set things up correctly. Next let's skip to the top of the file and work down. You need to give a short (one to a few word) description of the Affector and of each Node and parameter that the Affector uses. These go into the affectorDescription, nodeDescriptions, and paramDescriptions arrays respectively, which you'll find near the top of the code. Although these descriptions will not change how your model runs, I'd encourage you to do a decent job with them because they will serve as reminders to the Affectors function when you come back and look at it several months from now. Whether or not you make decent descriptions, you do need to get the number of strings in the nodeDescriptions and paramDescriptions arrays correct. The rest of Ingeneue uses the number of strings to decide how many Nodes and parameters this Affector is going to use, and therefore how many items ought to be on the input line for the Affector in a network file.

10. Type in descriptions for the Affector, the Node used by the Affector, and the parameters used by the Affector into the three sets of descriptions.

11. Just below the descriptive strings is an array called whichSides. This is used for Affectors that deal with membrane-bound Nodes. You'll use it in the second example below, but since this Affector doesn't worry about sides, you can delete the whichSides array.

The next method which you need to change is called setLabelsAndTypes(). The point of this method is to give a bunch of information to the rest of Ingeneue about this Affector. The rest of Ingeneue does not know anything about different Affectors in advance of making one. In fact, it doesn't even know which Affectors are available. This is a powerful feature of the way Ingeneue is constructed (and of object-oriented programming in general) because it means you will not have to change any other part of Ingeneue's code in order to add your new Affector. But it also means that the Affector's code needs to pass back a bunch of information about itself.

The comments in the template file are pretty complete. Rather than repeating them here, read through those comments and then try to take out the parts which don't need to be there and modify any of the remaining parts as needed.

12. Fix up the setLabelsAndTypes() method so it contains what's needed for your Affector.

The final pieces of code you need to write are to set the variables which store the positions of each parameter. Recall from step 7 that there were 3 variables (kParam, nuParam, halfLifeParam) which were indexes into the params[] array. The position of each parameter in the params[] array is not known in advance by an Affector. The rest of Ingeneue informs it of the correct indices using the setParameterNumbers() method. This whole mechanism is a bit tricky. I give an optional explanation here, but if you just want to plug in the right code you can skip past this next paragraph to what you actually have to do.

Optional: The params[] array and the param_nums[] array are a bit of a trick to let Affectors share parameters without the rest of Ingeneue needing to have any advance information about which parameters are going to be shared. What happens is that the input code to Ingeneue scans through the file with the network. As it reaches each parameter, it checks to see whether the name of the parameter is new, or whether a previous Affector has already used that name for a parameter. If its a new name, it gets added to the end of the params[] array. If its already in the params[] array then nothing happens. Next Ingeneue goes back and figures out where in the params array each Affectors parameters are. It then puts these positions into the param_nums[] array and passes that into the setParameterNumbers() method for the Affector.

What to Do: The rest of Ingeneue passes in the indices to each parameter in an array called param_nums[] in the setParamNumbers() method. The indices are listed in param_nums[] in the same order as they will appear on the input line when you use the Affector. So for instance, if your Affector will be typed in as:

&SimpleTranscriptionAff   ACT_NODE   K_ACT     nu_ACT    H_rna

then the param_nums[] array will contain the index value for K_ACT in param_nums[0], the index value to nu_ACT in param_nums[1], and the index value to H_rna in param_nums[2]. You must set each of your index variables to the appropriate index from the param_nums array.

13. Using the example code as a guide, put three lines of code into setParameterNumbers() that set the three parameter index variables to their correct values. Make sure to erase the already existing example code when you're done. You now need to declare the three index variables as int's. There is an example of declaring two such variables already near the top of the template file.

14. Replace the example declarations of parameter index variables with declarations of the index variables you used in this Affector. If you want, you can put a comment above each giving an explanation of what that parameter does. Note that these comments start with /** - this will include them in the online documentation file for this Affector.

The last thing you should do is write a longer comment at the top of the file giving a full description of the Affector. You can use the example comment as a guide to the information and format we use. This is especially important for Affectors where there is some subtlety to the formula or to its use.

15. At the top of the file is an example of how we write our comments for each Affector. Fill this in with the appropriate information for your Affector.

That's it. You should now have a working Affector. You will need to add it in to the rest of Ingeneue's code and compile it. How you do this depends on your programming environment. If you got everything right, you should then be able to use your new Affector in a network. You might try replacing the Txn1 Affector in a network you built or used from one of the other tutorials with the SimpleTranscription Affector. The network should behave identically.

Cross-membrane Affectors

The major complication we ignored in this first example of building an Affector is how to make one that works between cells. For instance, what if a ligand from one cell is binding to a receptor from another, or a protein is diffusing from the membrane of one cell to its neighbors. The second example in this tutorial takes you through building an Affector which works across membranes. The fact that Ingeneue can automatically connect things across cells avoids what would otherwise be a major headache. The scheme we use may seem a bit baroque at first, but is fairly straight forward once you get the hang of it.

All communication between cells in Ingeneue takes place between the adjacent faces of neighboring cells. Normally we use hexagonal cells, with the sides numbered with the top being side 0 increasing clockwise.

The exact numbering isn't important. What's important is that any given cell face has an opposing cell face from a neighboring cell with which it interacts. So, for example, face 2 of one cell will interact with face 5 of its neighbor. For any given cell face, we can refer to Nodes that live within that face and Nodes that live in the cell face adjacent to it. An Affector that mediates an interaction between cells will then specify whether each Node it uses is supposed to be in the same cell face as the Node being changed, or in the adjacent face.

I'll take you through building a dimerization Affector that creates a complex of a receptor from one cell with a ligand from the neighboring cell. The complex will reside in the same cell face as the receptor. The ligand comes from the opposing cell face. This whole setup would actually take three Affectors - one for the complex, one for the receptor, and one for the ligand. But I'll only take you through the Affector for the complex, and then let you try the other two yourself if you wish (or you can look at the already built Affectors that serve this purpose to see how the other two would come out).

The formula for dimerization is quite simple. The rate at which the complex is formed is proportional to the amount of receptor and to the amount of ligand, as follows:

dComplex/dt = dimerizationRate * Receptor * Ligand

Although this should only take a single parameter, our non-dimensionalization forces a second parameter into the equation that will actually be used in Ingeneue. You can see why if you remember that one of the byproducts of our non-dimensionalization is that the maximum steady-state concentrations of each Node is 1. That means even if the receptor can reach dimensional concentrations 100 times higher than the ligand, both of them will have non-dimensionalized concentrations near 1. This is a problem because to make a given number of complexes, there need to actually be enough ligand molecules around to bind to that number of receptor molecules. Thus in the non-dimensional version, we need an additional parameter that restores the information about relative concentrations of each Node. We call these the maximum concentration parameters, and denote them with the prefix 'max'. That was a fluffy explanation of where max comes from - if you want a more concrete explanation, see the appendix to our Nature paper and then work through the non-dimensionalization for the above heterodimerization equation. So, the upshot of all that is the equation you have to write is:

dComplex / dt = dimerizationRate * Receptor * Ligand * maxLigand

where the Receptor is on the same face as the Complex, and the Ligand is on the opposing face.

16. Go through steps 1 - 5 again. I'd suggest a name like SimpleDimerizeAff. In the first Affector you wrote, you got the concentration of a node by using the method Nodes[0].getIntegrationValue(side).There are actually two variables you could pass into this method. If you pass in 'side', you'll get the concentration of the Node in the same face as the Node whose concentration is being changed (the Complex in this case). If you pass in 'otherSide', you'll get the concentration of the Node from the opposing cell face. So in the above equation, you will want to use 'side' for the concentration of the Receptor and 'otherSide' for the concentration of the Ligand.

17. As in steps 6 - 9, and using the appropriate side or otherSide when getting Node concentrations, write the formula for the Affector into getValue(). You can make up any temporary variables you wish. Remember to 'return' the value at the end.

18. Follow steps 10 and 12 - 15 from above. Do NOT do step 11.

The last thing we need to do is tell the rest of Ingeneue which side each of the Nodes we're using is supposed to be on. This is done in the whichSides array that you deleted for the transcription Affector. The positions in the whichSides[] array match up with the strings in the nodeDescriptions array. For each Node in nodeDescriptions where the Affector will want the concentration in the same cell face as the Node its affecting (Receptor in this case), put a -1 into whichSides. For each Node where the concentration on the opposing face is needed (Ligand in this case), put a 1 into whichSides.

19. Place "-1"s and "1"s into the whichSides array appropriately.

That's it.



Step 7.

float proportion_active = Math.pow(activator, params[nuParam] /
               ( Math.pow(params[kParam], params[nuParam]) + 
                           Math.pow(activator, params[nuParam]) );

Step 8.

    float value = (Globals.characteristicTime / params[halfLifeParam]) * proportion_active;

Step 10.

public void setParameterNumbers(int [] param_nums) {
   kParam = param_nums[0];
   nuParam = param_nums[1];
   halfLifeParam = param_nums[2];

Step 11.

public void setLabelsAndTypes() {
    setDescriptions(affectorDescription, nodeDescriptions, paramDescriptions);
    this.type[GUI_CAPABLE] = 1;
    this.type[CERTIFICATION] = Affector.RETURNS_DERIV;
    this.type[MATHTYPE] = Affector.HH;
    this.type[TERMTYPE] = Affector.PRODUCTION;

Step 13.

[ These are suggested descriptions and the correct number of strings. ]

static final String    affectorDescription = "Single activator transcription";
static final String [] nodeDescriptions = {"Activator"};
static final String [] paramDescriptions = {"Half-maximal activator concentration",
                                            "Activator cooperativity", 
                                            "Half-life of target"};

Step 14.

int kParam;
int nuParam;
int halfLifeParam;

Go back to Tutorial 1

Go back to Tutorial 2