Impressions

Nitin Agrawal
nitina@cs.wisc.edu

Quickstart

Run make to install, then run ./impress <inputfile>. The default input file is ./inputfile.

Certain file extensions require additional software in extension_helpers.

Input file format

Parameters

Special flags

Printing flags

Printwhat gives the number of printing flags that follow; set each flag to 0 or 1 to turn it off or on.

Printwhat: 10
ext 0
sizebin 0
size 0
initial 0
final 0
depth 0
tree 0
subdirs 0
dircountfiles 0
constraint 0
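
To make the layout above concrete, here is a minimal sketch of reading such a block: a Printwhat: N header followed by N lines of the form name 0|1. The function parse_print_flags and the map it fills are hypothetical names for illustration; this is not the parser Impressions itself uses.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Impressions): reads a "Printwhat: N"
// header followed by N "name 0|1" lines into a name -> enabled map.
// Returns false if the header or any flag line is malformed or missing.
bool parse_print_flags(std::istream& in, std::map<std::string, bool>& flags) {
    std::string key;
    int count = 0;
    if (!(in >> key >> count) || key != "Printwhat:" || count < 0) {
        return false;
    }
    for (int i = 0; i < count; ++i) {
        std::string name;
        int value = 0;
        if (!(in >> name >> value) || (value != 0 && value != 1)) {
            return false;  // fewer lines than promised, or a value other than 0/1
        }
        flags[name] = (value == 1);
    }
    return true;
}
```

Feeding it a std::istringstream holding the block above should yield ten entries, all false, since every flag there is set to 0.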

SpecialDirBias

Significant code changes

depth.cpp

The shapes of two distributions, file count by file depth and base-2 log of bytes by file depth, are hard-coded into this file as arrays. I have modified the constants Total_depthcount_prob, depthcount_prob, and depth_meansize to suit my own data; the original values come from a five-year study of Windows filesystems, so they are probably far more representative.
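
As a rough illustration of how a hard-coded per-depth array can drive sampling, the sketch below picks a depth by walking cumulative probabilities. The array kDepthCountProb, its values, and the functions depth_for and sample_depth are all made up for this example; they are assumptions, not the contents or the logic of depth.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Illustrative values only; NOT the numbers found in depth.cpp.
// Entry d is the assumed probability that a file sits at depth d.
static const std::vector<double> kDepthCountProb = {0.05, 0.15, 0.30,
                                                    0.25, 0.15, 0.10};

// Return the first depth whose cumulative probability exceeds u.
std::size_t depth_for(double u, const std::vector<double>& probs) {
    assert(!probs.empty());
    double cumulative = 0.0;
    for (std::size_t d = 0; d < probs.size(); ++d) {
        cumulative += probs[d];
        if (u < cumulative) {
            return d;
        }
    }
    return probs.size() - 1;  // u at or past the total: clamp to the deepest bin
}

// Draw a depth at random, scaling by the array's own total so the
// entries do not have to sum to exactly 1.
std::size_t sample_depth(std::mt19937& rng, const std::vector<double>& probs) {
    double total = 0.0;
    for (double p : probs) {
        total += p;
    }
    std::uniform_real_distribution<double> dist(0.0, total);
    return depth_for(dist(rng), probs);
}
```

Swapping in a different data set then only means replacing the array; the sampling code stays the same, which is presumably why editing the constants was enough here.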

montecarlo.cpp

The montecarlo function in montecarlo.cpp contains this nested loop:

list<dir> LD;
list<dir>::iterator ni;
...
int montecarlo(int numdirs) {
    ...
    LD.push_front(Dirs[0]);
    ...
    for (long i = 1; i < numdirs; i++) {
        long token_uptil_now = 0, sum_childs_plus2 = 2;
        long token = (rand() % sum_childs_plus2) + 1;
        ni = LD.begin();
        token_uptil_now += (*ni).subdirs+2;
        while (token_uptil_now < token) {
            ni++;
            token_uptil_now+= (*ni).subdirs+2;
        }
        ...
        LD.push_back(Dirs[i]);
        sum_childs_plus2+=2+1;
        ...
    }
    ...
}

This is very slow when numdirs is large, mostly because the while loop walks the list iterator one element at a time. For now, I replace the walk with a direct jump into the middle of the container. That requires a random-access iterator, but the container must still support pushing elements onto both its front and back, so I replace the list with a deque. There is probably a more proper fix, but finding it likely requires actually understanding the Monte Carlo simulation code.

deque<dir> LD;
deque<dir>::iterator ni;
...
int montecarlo(int numdirs) {
    ...
    for (long i = 1; i < numdirs; i++) {
        ...
        ni = LD.begin();
        ni += token / 3;
        ...
    }
    ...
}
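
To compare the two lookups side by side, here is a small self-contained sketch. The Dir struct and the functions walk_to_token and jump_to_token are hypothetical stand-ins written for this example, and the bounds checks are my additions; none of this is code from montecarlo.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <iterator>
#include <list>

// Hypothetical stand-in for the dir type; only subdirs matters here.
struct Dir {
    long id;
    long subdirs;
};

// Walk in the style of the original loop: accumulate subdirs + 2 per
// directory until the running total reaches the token. O(n) per lookup.
// The end-of-container guard is an addition for safety.
template <typename Container>
std::size_t walk_to_token(const Container& dirs, long token) {
    assert(!dirs.empty());
    std::size_t index = 0;
    auto it = dirs.begin();
    long covered = it->subdirs + 2;
    while (covered < token && std::next(it) != dirs.end()) {
        ++it;
        ++index;
        covered += it->subdirs + 2;
    }
    return index;
}

// Jump in the style of the replacement: index directly with token / 3.
// O(1) on a deque. The clamp is an addition for safety.
std::size_t jump_to_token(const std::deque<Dir>& dirs, long token) {
    assert(!dirs.empty());
    std::size_t index = static_cast<std::size_t>(token / 3);
    if (index >= dirs.size()) {
        index = dirs.size() - 1;
    }
    return index;
}
```

With four directories whose subdirs counts are 3, 0, 0, 0, a token of 4 lands on index 0 with the walk but on index 1 with the jump. The walk weights each directory by subdirs + 2, while the jump treats every directory as if it weighed 3, so the two lookups agree only on average. That difference may matter if the simulation relies on directories with more children being picked more often.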