Show uncertainty in Morrison et al.'s GeneRank matrix

Morrison et al. define the GeneRank variant of PageRank to compute the importance of genes from a set of gene expression measurements combined with connectivity information about the genes. They report that alpha = 0.75 to 0.85 gives the best results. We'll work with their data and examine what happens for a large set of parameters in our model.

Set up the experiment

This experiment should be run from the rapr/experiments/generank directory.

cwd = pwd;
dirtail = 'experiments/generank';
if strcmp(cwd(end-length(dirtail)+1:end),dirtail) == 0
    warning('rapr:dir','%s should be executed from rapr/%s\n',mfilename,dirtail);
end
addpath('../../matlab'); % ensure we have the RAPr codes available

Load the data. The file generank.mat comes from http://www.biomedcentral.com/1471-2105/6/233/additional/; we then converted the 4k by 4k matrix w_All into a transition matrix for our codes.

However, we left the vector v = expr_data as it is in the file, so we need to normalize it in this code. The normalization comes from the generank.m file available from the same webpage: it takes absolute values and divides by the largest entry, so the result has maximum entry one rather than unit sum. Kendall's tau depends only on orderings, so this positive rescaling should not affect the rank comparisons below.

load('../../data/generank.mat');
n=size(P,1);

% scale v as in generank.m: absolute values divided by the largest entry
% (this gives max(v) = 1, not sum(v) = 1)
v = abs(v); v = v/max(v);

Evaluate the algorithm for uniform measures

For the GeneRank problem, the distribution of A (the random variable we use to model alpha) should not be driven by a random surfer model as in the PageRank case. Instead, alpha is literally an unknown parameter. Consequently, a uniform distribution makes the most sense. The authors of the GeneRank paper claim that 0.75 <= alpha <= 0.85 gives the most interesting results. However, what should we pick inside this interval?

In this experiment, we'll assume A is uniform on [l,r] for every pair of endpoints l < r taken from 0, 0.2, ..., 1.

% set N for the gqrapr algorithm
N=50;

pts=0:.2:1; npts = length(pts); tic;
ktex=zeros(npts); ktstdx=ktex; % kendall tau ex,stdx
for pi=1:npts,for pj=pi+1:npts
    l=pts(pi); r=pts(pj); fprintf('starting [%3.1f,%3.1f]...',l,r); tic
    d=alphadist('unif',l,r);eA=d.mf(1);eA=eA(end);
    xeA=(speye(n)-eA*P')\v; xeA=xeA./norm(xeA,1); % compute x(E(A))
    [ex stdx] = gqrapr(P,N,d,'direct',v); % compute E[x(A)], Std[x(A)]
    ktex(pi,pj)=ktau(xeA,ex); ktstdx(pi,pj)=ktau(xeA,stdx); % compute taus
    fprintf (' ... done! %f secs\n', toc);
end, end

% save the output
save 'generank-unif.mat' pts ktex ktstdx
starting [0.0,0.2]... ... done! 35.821255 secs
starting [0.0,0.4]... ... done! 33.965974 secs
starting [0.0,0.6]... ... done! 34.079978 secs
starting [0.0,0.8]... ... done! 33.965636 secs
starting [0.0,1.0]... ... done! 34.014011 secs
starting [0.2,0.4]... ... done! 34.197860 secs
starting [0.2,0.6]... ... done! 33.969022 secs
starting [0.2,0.8]... ... done! 33.911383 secs
starting [0.2,1.0]... ... done! 34.155754 secs
starting [0.4,0.6]... ... done! 33.969462 secs
starting [0.4,0.8]... ... done! 33.969453 secs
starting [0.4,1.0]... ... done! 34.150154 secs
starting [0.6,0.8]... ... done! 34.005013 secs
starting [0.6,1.0]... ... done! 33.908104 secs
starting [0.8,1.0]... ... done! 34.225825 secs
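
To make the quantities compared in this loop concrete, here is a minimal sketch on a toy problem. It is not the gqrapr algorithm: the 3-node matrix Ptoy, the vector vtoy, the interval [0.6,0.8], and the midpoint rule are all assumptions chosen for illustration. The tau computed at the end is the plain tau-a variant, which may differ from how the RAPr ktau helper treats ties.

```matlab
% Toy data (hypothetical, not the GeneRank matrix).
Ptoy = [0 1/2 1/2; 1/3 1/3 1/3; 1 0 0];   % row-stochastic transition matrix
vtoy = [0.2; 0.3; 0.5];                   % probability vector
nt = size(Ptoy,1);
l = 0.6; r = 0.8;                         % A ~ Uniform[l,r]

% PageRank-style solution x(alpha) of (I - alpha*P')x = (1-alpha)v.
xfun = @(a) (1-a) * ((eye(nt) - a*Ptoy') \ vtoy);

% Midpoint rule on [l,r]; for a uniform density the integral is a plain mean.
m = 200; h = (r-l)/m;
alphas = l + h*((1:m) - 0.5);
X = zeros(nt,m);
for k = 1:m
    X(:,k) = xfun(alphas(k));
end
extoy   = mean(X,2);                                 % approximates E[x(A)]
stdxtoy = sqrt(max(mean(X.^2,2) - extoy.^2, 0));     % approximates Std[x(A)]
xeAtoy  = xfun((l+r)/2);                             % x(E[A]), E[A] = (l+r)/2

% Kendall tau-a between the orderings of x(E[A]) and E[x(A)].
tautoy = 0;
for i = 1:nt-1
    for j = i+1:nt
        tautoy = tautoy + sign(xeAtoy(i)-xeAtoy(j)) * sign(extoy(i)-extoy(j));
    end
end
tautoy = tautoy / (nt*(nt-1)/2);
fprintf('tau-a between x(E[A]) and E[x(A)]: %g\n', tautoy);
```

The same pattern, with the quadrature rule and distribution swapped out, is what the ktex and ktstdx entries summarize for the real data.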

Display the output as a LaTeX table with colored cells. This code generates the input to a table for the paper. It colors each table cell by the strength of its Kendall tau value: positive values generate red cells and negative values generate blue cells. Unfortunately, we don't see any negative examples in these cases. We generate two tables, one for ex and one for stdx.

load 'generank-unif.mat'; npts=length(pts);

ccv = @(x) [1 2-x 2-x].*[1 0.5 0.5]*(x>=0)+... %  red for pos
    [x+2 x+2 1].*[0.5 0.5 1]*(x<0);            % blue for neg
ccc = @(x) sprintf('\\cellcolor[rgb]{%0.2f,%0.2f,%0.2f}',ccv(x));

fprintf('ktau dist between x(E(A)) and E(x(A)) for A uniform \n');
disp(ktex); fprintf('... latex table code ... \n');

vals = ktex; fprintf(' & %1.1f', pts(2:end)); fprintf('\\\\ \\hline \n');
for pi=1:npts
    fprintf('%1.1f & ', pts(pi));
    for pj=1:npts
        if pj>pi, fprintf('%s %5.3f', ccc(vals(pi,pj)), vals(pi,pj)); end
        if pj==npts, fprintf('\\\\ \n'); elseif pj>1, fprintf(' & '); end
    end
end, fprintf('\n');

fprintf('ktau dist between x(E(A)) and Std(x(A)) for A uniform \n');
disp(ktstdx); fprintf('... latex table code ... \n');

vals = ktstdx; fprintf(' & %1.1f', pts(2:end)); fprintf('\\\\ \\hline \n');
for pi=1:npts
    fprintf('%1.1f & ', pts(pi));
    for pj=1:npts
        if pj>pi, fprintf('%s %5.3f', ccc(vals(pi,pj)), vals(pi,pj)); end
        if pj==npts, fprintf('\\\\ \n'); elseif pj>1, fprintf(' & '); end
    end
end, fprintf('\n');
ktau dist between x(E(A)) and E(x(A)) for A uniform 
  Columns 1 through 3

                         0         0.999153910796035         0.995902751402379
                         0                         0         0.998748454806139
                         0                         0                         0
                         0                         0                         0
                         0                         0                         0
                         0                         0                         0

  Columns 4 through 6

         0.988453636705646         0.972598044724126         0.934971190579693
         0.993625016100065         0.980177974331386         0.943758824880184
         0.997934794570811         0.988209166869682         0.954465025719488
                         0         0.995758622380715         0.967386599824762
                         0                         0         0.983612654549161
                         0                         0                         0

... latex table code ... 
 & 0.2 & 0.4 & 0.6 & 0.8 & 1.0\\ \hline 
0.0 & \cellcolor[rgb]{1.00,0.50,0.50} 0.999 & \cellcolor[rgb]{1.00,0.50,0.50} 0.996 & \cellcolor[rgb]{1.00,0.51,0.51} 0.988 & \cellcolor[rgb]{1.00,0.51,0.51} 0.973 & \cellcolor[rgb]{1.00,0.53,0.53} 0.935\\ 
0.2 &  & \cellcolor[rgb]{1.00,0.50,0.50} 0.999 & \cellcolor[rgb]{1.00,0.50,0.50} 0.994 & \cellcolor[rgb]{1.00,0.51,0.51} 0.980 & \cellcolor[rgb]{1.00,0.53,0.53} 0.944\\ 
0.4 &  &  & \cellcolor[rgb]{1.00,0.50,0.50} 0.998 & \cellcolor[rgb]{1.00,0.51,0.51} 0.988 & \cellcolor[rgb]{1.00,0.52,0.52} 0.954\\ 
0.6 &  &  &  & \cellcolor[rgb]{1.00,0.50,0.50} 0.996 & \cellcolor[rgb]{1.00,0.52,0.52} 0.967\\ 
0.8 &  &  &  &  & \cellcolor[rgb]{1.00,0.51,0.51} 0.984\\ 
1.0 &  &  &  &  & \\ 

ktau dist between x(E(A)) and Std(x(A)) for A uniform 
  Columns 1 through 3

                         0         0.166309589847077         0.211662997572823
                         0                         0         0.256138192362956
                         0                         0                         0
                         0                         0                         0
                         0                         0                         0
                         0                         0                         0

  Columns 4 through 6

         0.261256606597638         0.316928868412181         0.389021139955952
         0.304770583942663         0.355768479115878         0.414426755326689
         0.342394989568625         0.381473370237008         0.413101985078932
                         0          0.38231968037935         0.380679948203753
                         0                         0         0.325921247370399
                         0                         0                         0

... latex table code ... 
 & 0.2 & 0.4 & 0.6 & 0.8 & 1.0\\ \hline 
0.0 & \cellcolor[rgb]{1.00,0.92,0.92} 0.166 & \cellcolor[rgb]{1.00,0.89,0.89} 0.212 & \cellcolor[rgb]{1.00,0.87,0.87} 0.261 & \cellcolor[rgb]{1.00,0.84,0.84} 0.317 & \cellcolor[rgb]{1.00,0.81,0.81} 0.389\\ 
0.2 &  & \cellcolor[rgb]{1.00,0.87,0.87} 0.256 & \cellcolor[rgb]{1.00,0.85,0.85} 0.305 & \cellcolor[rgb]{1.00,0.82,0.82} 0.356 & \cellcolor[rgb]{1.00,0.79,0.79} 0.414\\ 
0.4 &  &  & \cellcolor[rgb]{1.00,0.83,0.83} 0.342 & \cellcolor[rgb]{1.00,0.81,0.81} 0.381 & \cellcolor[rgb]{1.00,0.79,0.79} 0.413\\ 
0.6 &  &  &  & \cellcolor[rgb]{1.00,0.81,0.81} 0.382 & \cellcolor[rgb]{1.00,0.81,0.81} 0.381\\ 
0.8 &  &  &  &  & \cellcolor[rgb]{1.00,0.84,0.84} 0.326\\ 
1.0 &  &  &  &  & \\ 
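
As a quick check of the color mapping used for the tables above, the sketch below evaluates the ccv handle from the table code at a few sample values. The sample inputs are chosen only for illustration. A value of 1 maps to the strongest red, 0 maps to white, and -1 maps to the strongest blue.

```matlab
% Same color map as in the table code above.
ccv = @(x) [1 2-x 2-x].*[1 0.5 0.5]*(x>=0)+... %  red for pos
    [x+2 x+2 1].*[0.5 0.5 1]*(x<0);            % blue for neg

cpos  = ccv(1);    % [1 0.5 0.5], strongest red
czero = ccv(0);    % [1 1 1], white
cneg  = ccv(-1);   % [0.5 0.5 1], strongest blue

fprintf('%5.2f %5.2f %5.2f\n', [cpos; czero; cneg]');
```

Because every tau in these experiments is positive, only the red branch is exercised in the generated tables.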

Evaluate the algorithm for a set of beta measures

All of our codes work for a beta distribution as well. In this case, we'll look at a beta distribution over the interval [0,1]. Using an appropriate beta distribution allows us to heavily weight choices in the interval [0.75,0.85] while still considering values outside that interval. A Beta(4,13) puts approximately as much weight on [0.75,0.85] as on [0.65,0.75] and isn't a bad guesstimate of this range. In this experiment, we'll test a range of beta distributions with a and b taken from 1, 4, ..., 16.
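
The weight a given beta distribution places on each interval can be checked numerically. The sketch below is a rough check under an assumption: it takes the density of Beta(a,b) on [0,1] to be proportional to x^b (1-x)^a. The convention actually used by alphadist is not shown here and should be confirmed before relying on these numbers. The standard textbook convention, x^(a-1) (1-x)^(b-1), would instead put most of the mass of Beta(4,13) near 0.2.

```matlab
a = 4; b = 13;
w = @(x) x.^b .* (1-x).^a;        % ASSUMED unnormalized density on [0,1]
Z = integral(w, 0, 1);            % normalizing constant
mlo = integral(w, 0.65, 0.75) / Z;   % mass on [0.65,0.75]
mhi = integral(w, 0.75, 0.85) / Z;   % mass on [0.75,0.85]
fprintf('mass on [0.65,0.75]: %.3f\n', mlo);
fprintf('mass on [0.75,0.85]: %.3f\n', mhi);
```

Changing a and b in this sketch gives a quick feel for how each (a,b) pair in the sweep below spreads its weight over alpha.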

% set N for the gqrapr algorithm
N=50;

pts=1:3:16; npts = length(pts); tic;
ktex=zeros(npts); ktstdx=ktex; % kendall tau ex,stdx
for pi=1:npts, for pj=1:npts
    a=pts(pi); b=pts(pj); fprintf('starting (%i,%i)...',a,b); tic
    d=alphadist('beta',a,b);eA=d.mf(1);eA=eA(end);
    xeA=(speye(n)-eA*P')\v; xeA=xeA./norm(xeA,1); % compute x(E(A))
    [ex stdx] = gqrapr(P,N,d,'direct',v); % compute E[x(A)], Std[x(A)]
    ktex(pi,pj)=ktau(xeA,ex); ktstdx(pi,pj)=ktau(xeA,stdx); % compute taus
    fprintf (' ... done! %f secs\n', toc);
end, end

save 'generank-beta.mat' pts ktex ktstdx
starting (1,1)... ... done! 34.097790 secs
starting (1,4)... ... done! 33.919597 secs
starting (1,7)... ... done! 34.160217 secs
starting (1,10)... ... done! 33.837667 secs
starting (1,13)... ... done! 33.742732 secs
starting (1,16)... ... done! 33.697770 secs
starting (4,1)... ... done! 33.950472 secs
starting (4,4)... ... done! 33.495354 secs
starting (4,7)... ... done! 33.525062 secs
starting (4,10)... ... done! 33.567836 secs
starting (4,13)... ... done! 33.579954 secs
starting (4,16)... ... done! 34.099114 secs
starting (7,1)... ... done! 33.928657 secs
starting (7,4)... ... done! 34.027759 secs
starting (7,7)... ... done! 33.752655 secs
starting (7,10)... ... done! 33.937168 secs
starting (7,13)... ... done! 33.999789 secs
starting (7,16)... ... done! 34.254916 secs
starting (10,1)... ... done! 33.865671 secs
starting (10,4)... ... done! 33.984544 secs
starting (10,7)... ... done! 34.146307 secs
starting (10,10)... ... done! 33.598635 secs
starting (10,13)... ... done! 32.956274 secs
starting (10,16)... ... done! 32.921526 secs
starting (13,1)... ... done! 32.842540 secs
starting (13,4)... ... done! 33.113293 secs
starting (13,7)... ... done! 33.923637 secs
starting (13,10)... ... done! 34.066755 secs
starting (13,13)... ... done! 33.661312 secs
starting (13,16)... ... done! 33.391877 secs
starting (16,1)... ... done! 33.539222 secs
starting (16,4)... ... done! 33.716831 secs
starting (16,7)... ... done! 34.202432 secs
starting (16,10)... ... done! 33.927360 secs
starting (16,13)... ... done! 33.986841 secs
starting (16,16)... ... done! 34.110226 secs

Now write the tables, using the same color codes as before. This time, we'll write the entire table instead of skipping the lower half.

load 'generank-beta.mat'; npts=length(pts);

fprintf('ktau dist between x(E(A)) and E(x(A)) for A beta \n');
disp(ktex); fprintf('... latex table code ... \n');

vals=ktex; fprintf(' & %i', pts); fprintf('\\\\ \\hline \n');
for pi=1:npts
    fprintf('%i & ', pts(pi));
    for pj=1:npts
        fprintf('%s %5.3f', ccc(vals(pi,pj)), vals(pi,pj));
        if pj==npts, fprintf('\\\\ \n'); else, fprintf(' & '); end
    end
end, fprintf('\n');

fprintf('ktau dist between x(E(A)) and Std(x(A)) for A beta \n');
disp(ktstdx); fprintf('... latex table code ... \n');

vals = ktstdx; fprintf(' & %i', pts); fprintf('\\\\ \\hline \n');
for pi=1:npts
    fprintf('%i & ', pts(pi));
    for pj=1:npts
        fprintf('%s %5.3f', ccc(vals(pi,pj)), vals(pi,pj));
        if pj==npts, fprintf('\\\\ \n'); else, fprintf(' & '); end
    end
end, fprintf('\n');
ktau dist between x(E(A)) and E(x(A)) for A beta 
  Columns 1 through 3

         0.963935308066816         0.964999853427596         0.970467614827264
         0.989690828682611         0.984854915898576         0.984205600068107
         0.995215815358824         0.991861261393641         0.990465463934085
         0.997214513599484         0.994877965946222         0.993733541140431
          0.99819416701717         0.996481529434182         0.995466026959551
         0.998724942226051         0.997477183408322          0.99662626968346

  Columns 4 through 6

         0.975210212284452         0.978923314473741         0.981798392320387
         0.984679030863263         0.985447252588669         0.986442479274815
         0.990008586089828         0.990034541012735         0.990253057256067
         0.993048253653823         0.992735077264123         0.992662096864816
         0.994855430751868         0.994548178016351         0.994352259557271
         0.996047308686377         0.995735964470849         0.995481905084829

... latex table code ... 
 & 1 & 4 & 7 & 10 & 13 & 16\\ \hline 
1 & \cellcolor[rgb]{1.00,0.52,0.52} 0.964 & \cellcolor[rgb]{1.00,0.52,0.52} 0.965 & \cellcolor[rgb]{1.00,0.51,0.51} 0.970 & \cellcolor[rgb]{1.00,0.51,0.51} 0.975 & \cellcolor[rgb]{1.00,0.51,0.51} 0.979 & \cellcolor[rgb]{1.00,0.51,0.51} 0.982\\ 
4 & \cellcolor[rgb]{1.00,0.51,0.51} 0.990 & \cellcolor[rgb]{1.00,0.51,0.51} 0.985 & \cellcolor[rgb]{1.00,0.51,0.51} 0.984 & \cellcolor[rgb]{1.00,0.51,0.51} 0.985 & \cellcolor[rgb]{1.00,0.51,0.51} 0.985 & \cellcolor[rgb]{1.00,0.51,0.51} 0.986\\ 
7 & \cellcolor[rgb]{1.00,0.50,0.50} 0.995 & \cellcolor[rgb]{1.00,0.50,0.50} 0.992 & \cellcolor[rgb]{1.00,0.50,0.50} 0.990 & \cellcolor[rgb]{1.00,0.50,0.50} 0.990 & \cellcolor[rgb]{1.00,0.50,0.50} 0.990 & \cellcolor[rgb]{1.00,0.50,0.50} 0.990\\ 
10 & \cellcolor[rgb]{1.00,0.50,0.50} 0.997 & \cellcolor[rgb]{1.00,0.50,0.50} 0.995 & \cellcolor[rgb]{1.00,0.50,0.50} 0.994 & \cellcolor[rgb]{1.00,0.50,0.50} 0.993 & \cellcolor[rgb]{1.00,0.50,0.50} 0.993 & \cellcolor[rgb]{1.00,0.50,0.50} 0.993\\ 
13 & \cellcolor[rgb]{1.00,0.50,0.50} 0.998 & \cellcolor[rgb]{1.00,0.50,0.50} 0.996 & \cellcolor[rgb]{1.00,0.50,0.50} 0.995 & \cellcolor[rgb]{1.00,0.50,0.50} 0.995 & \cellcolor[rgb]{1.00,0.50,0.50} 0.995 & \cellcolor[rgb]{1.00,0.50,0.50} 0.994\\ 
16 & \cellcolor[rgb]{1.00,0.50,0.50} 0.999 & \cellcolor[rgb]{1.00,0.50,0.50} 0.997 & \cellcolor[rgb]{1.00,0.50,0.50} 0.997 & \cellcolor[rgb]{1.00,0.50,0.50} 0.996 & \cellcolor[rgb]{1.00,0.50,0.50} 0.996 & \cellcolor[rgb]{1.00,0.50,0.50} 0.995\\ 

ktau dist between x(E(A)) and Std(x(A)) for A beta 
  Columns 1 through 3

         0.378260938868487         0.410174495690102         0.385725045698525
          0.26290536588202         0.361814368603603         0.395077169769585
         0.216902620769714          0.30526257867656         0.355458988763943
         0.193631417206659         0.268250202254638         0.319201942621671
          0.17967992375304          0.24351855340863         0.291067633393102
         0.170402735724735         0.225699164152492         0.269493241852621

  Columns 4 through 6

         0.361762024829118         0.344073344831124         0.331233988872345
         0.399240657184287         0.392466544848713         0.382965500701441
         0.381516853383627         0.392090366289958         0.393836263967129
         0.352192133959369         0.372957536322935          0.38452879414883
         0.325763744475214         0.350160396024004         0.367225311099931
         0.303385944683577         0.329414394387254         0.348785546787472

... latex table code ... 
 & 1 & 4 & 7 & 10 & 13 & 16\\ \hline 
1 & \cellcolor[rgb]{1.00,0.81,0.81} 0.378 & \cellcolor[rgb]{1.00,0.79,0.79} 0.410 & \cellcolor[rgb]{1.00,0.81,0.81} 0.386 & \cellcolor[rgb]{1.00,0.82,0.82} 0.362 & \cellcolor[rgb]{1.00,0.83,0.83} 0.344 & \cellcolor[rgb]{1.00,0.83,0.83} 0.331\\ 
4 & \cellcolor[rgb]{1.00,0.87,0.87} 0.263 & \cellcolor[rgb]{1.00,0.82,0.82} 0.362 & \cellcolor[rgb]{1.00,0.80,0.80} 0.395 & \cellcolor[rgb]{1.00,0.80,0.80} 0.399 & \cellcolor[rgb]{1.00,0.80,0.80} 0.392 & \cellcolor[rgb]{1.00,0.81,0.81} 0.383\\ 
7 & \cellcolor[rgb]{1.00,0.89,0.89} 0.217 & \cellcolor[rgb]{1.00,0.85,0.85} 0.305 & \cellcolor[rgb]{1.00,0.82,0.82} 0.355 & \cellcolor[rgb]{1.00,0.81,0.81} 0.382 & \cellcolor[rgb]{1.00,0.80,0.80} 0.392 & \cellcolor[rgb]{1.00,0.80,0.80} 0.394\\ 
10 & \cellcolor[rgb]{1.00,0.90,0.90} 0.194 & \cellcolor[rgb]{1.00,0.87,0.87} 0.268 & \cellcolor[rgb]{1.00,0.84,0.84} 0.319 & \cellcolor[rgb]{1.00,0.82,0.82} 0.352 & \cellcolor[rgb]{1.00,0.81,0.81} 0.373 & \cellcolor[rgb]{1.00,0.81,0.81} 0.385\\ 
13 & \cellcolor[rgb]{1.00,0.91,0.91} 0.180 & \cellcolor[rgb]{1.00,0.88,0.88} 0.244 & \cellcolor[rgb]{1.00,0.85,0.85} 0.291 & \cellcolor[rgb]{1.00,0.84,0.84} 0.326 & \cellcolor[rgb]{1.00,0.82,0.82} 0.350 & \cellcolor[rgb]{1.00,0.82,0.82} 0.367\\ 
16 & \cellcolor[rgb]{1.00,0.91,0.91} 0.170 & \cellcolor[rgb]{1.00,0.89,0.89} 0.226 & \cellcolor[rgb]{1.00,0.87,0.87} 0.269 & \cellcolor[rgb]{1.00,0.85,0.85} 0.303 & \cellcolor[rgb]{1.00,0.84,0.84} 0.329 & \cellcolor[rgb]{1.00,0.83,0.83} 0.349\\