Read more#

FAQ#

I cannot use docker after installation#

If you are a sudoer, there are a few post-installation steps on Linux:

sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker
docker run hello-world # Test a hello-world
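To confirm that the group change took effect before retrying Docker, a small check like the one below may help. This is a minimal sketch: the helper name `has_docker_group` is hypothetical and not part of the artifact. It only inspects a list of group names such as the one printed by `id -nG`.

```shell
# Return 0 if the given space-separated group list contains "docker".
# The function name is illustrative; it is not part of the artifact.
has_docker_group() {
    local groups_list="$1"
    local g
    for g in $groups_list; do
        [ "$g" = "docker" ] && return 0
    done
    return 1
}

# Usage (in the shell where you intend to run docker):
#   has_docker_group "$(id -nG)" && echo "in docker group" || echo "not in docker group yet"
```

If the check fails in an existing terminal, logging out and back in (or running `newgrp docker` again) is usually needed for the new group membership to apply.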

If you still encounter problems, you are encouraged to contact me for access to the original test-bed, or to try running as a non-root user according to the Dockerfile.

The OSS-dev version of NNSmith#

NNSmith has since been refined for practical, real-world use and is continuously developed on GitHub. The OSS-dev version offers better features, stability, usability, and extensibility. For example, it supports TensorFlow fuzzing as of Oct 13, 2022, and can be installed via PyPI. Note that the OSS-dev version does not necessarily reflect the implementation described in the original NNSmith paper[1]. To evaluate the implementation behind our ASPLOS’23 paper, please use this artifact.

Extra experiments#

EX1: NNSmith-base coverage (Section 5.3 ablation study)#

EX1: Evaluating NNSmith base (binning disabled) on {tvm, ort}

Fuzzer type: NNSmith base (without binning);
System under test (SUT):
- TVM (LLVM CPU backend);
- ONNXRuntime (CPU backend);
Experiment time: 8 hours;
Outputs (will be used for visualization soon):
- /artifact/nnsmith/nnsmith-tvm-base/
- /artifact/nnsmith/nnsmith-ort-base/
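After the 8-hour run, it may be worth checking that both output directories exist and are non-empty before moving on to visualization. The sketch below assumes only the directory names listed above. The helper `check_output_dirs` is hypothetical and does not validate the contents of the directories.

```shell
# Report which expected output directories are missing or empty.
# $1: base directory; remaining arguments: directory names under the base.
# Returns 0 only if every directory exists and has at least one entry.
check_output_dirs() {
    local base="$1"; shift
    local rc=0 name
    for name in "$@"; do
        if [ -d "$base/$name" ] && [ -n "$(ls -A "$base/$name")" ]; then
            echo "ok: $base/$name"
        else
            echo "missing or empty: $base/$name"
            rc=1
        fi
    done
    return "$rc"
}

# Usage for EX1:
#   check_output_dirs /artifact/nnsmith nnsmith-tvm-base nnsmith-ort-base
```

The same helper can be pointed at the EX2 output directories by passing their names instead.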

Check the results

Figure 10: Impact of attribute binning on coverage.

../_images/ort_br_cov_venn1.png — Figure 10.(a) **ONNXRuntime**
See `./ort-binning/ort_br_cov_venn.png`#

../_images/tvm_br_cov_venn1.png — Figure 10.(b) **TVM**
See `./tvm-binning/tvm_br_cov_venn.png`#

EX2: Gradient-based value search (Section 5.3 ablation study)#

EX2: Evaluating gradient-based value search

Experiment time: 1.5 hours;
Outputs (will be used for visualization soon):
- /artifact/nnsmith/512-model-10-node-exp/
- /artifact/nnsmith/512-model-20-node-exp/
- /artifact/nnsmith/512-model-30-node-exp/

Figure 11: Effectiveness of gradient-based search

../_images/input-search-merge.png — Figure 11
See `./gradient/input-search-merge.png`#

Generate LEMON models from scratch#

Running LEMON in NNSmith’s setting is very complicated. For this reason, we do not suggest running it from scratch; instead, we generated the data on the test-bed in advance.

Extra constraints for running LEMON from scratch

LEMON’s GitHub repository: Jacob-yen/LEMON;
LEMON requires GPUs and nvidia-docker2 installed;
At least 256 GB of extra disk space is recommended;
The whole experiment takes at least 5 hours (if you succeed in one pass) but in practice it could take more time due to setup complexity;
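Before starting, a quick preflight check of the disk-space recommendation above might save a failed multi-hour run. The following is a sketch under stated assumptions: `has_free_gb` is a hypothetical helper, and it relies on `df -Pk` printing available space (in 1024-byte blocks) in the fourth column of its second output line.

```shell
# Return 0 if the filesystem holding $1 has at least $2 GB available.
# The function name is illustrative; it is not part of LEMON or NNSmith.
has_free_gb() {
    local path="$1" required_gb="$2"
    local avail_kb
    # POSIX output format, sizes in 1024-byte blocks; column 4 is "Available".
    avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
    [ "$(( avail_kb / 1024 / 1024 ))" -ge "$required_gb" ]
}

# Usage:
#   has_free_gb / 256 || echo "less than 256 GB free on /"
#   command -v nvidia-smi >/dev/null || echo "nvidia-smi not found; check the GPU driver"
```

Pass the path where Docker stores its data (or where you plan to copy the models) rather than `/` if those live on a different filesystem.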

Here is the overview:

We evaluated LEMON based on LEMON’s official docker image;
We tweaked the code to make it work and compare fairly:
- The code version is in a fork;
- The main change is to disable LEMON’s testing phase, which is not necessary for the purpose of “model generation”. Note that we made this change so that LEMON runs faster, which keeps the comparison fair;
- We also changed the configuration file to make it work.

Step 1: Set up LEMON’s docker image

Please refer to the environment and Redis startup sections of the original repository to set up the LEMON environment.

Step 2: Running LEMON

Do not follow the instructions in the Running LEMON section of the original repository. Instead, follow the instructions below:

# In the LEMON docker container
cd /
git clone https://github.com/ganler/LEMON.git LEMON-nnsmith
cd /LEMON-nnsmith
source activate lemon
python -u -m run.mutation_executor tzer.conf

And wait for about 4 hours.

Step 3: Converting LEMON models to ONNX models

Next, copy the generated models located in /LEMON-nnsmith/lemon_outputs from the LEMON docker container to your local machine, for example to /path/to/lemon_outputs.
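One way to do the copy is with `docker cp`, sketched below. The container name is an assumption (find the actual name or ID with `docker ps`), and the wrapper function `copy_lemon_outputs` with its `DRY_RUN` switch is hypothetical, added only so the command can be previewed before it runs.

```shell
# Copy LEMON's generated models out of a container onto the host.
# $1: container name or ID (assumed; look it up with `docker ps`)
# $2: destination path on the host
# Set DRY_RUN=1 to print the command instead of executing it.
copy_lemon_outputs() {
    local container="$1" dest="$2"
    local src="${container}:/LEMON-nnsmith/lemon_outputs"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "docker cp ${src} ${dest}"
    else
        docker cp "${src}" "${dest}"
    fi
}

# Usage:
#   DRY_RUN=1 copy_lemon_outputs my-lemon-container /path/to/lemon_outputs
#   copy_lemon_outputs my-lemon-container /path/to/lemon_outputs
```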

Then convert the LEMON models (i.e., Keras models) to ONNX models in NNSmith’s container:

cd /artifact/
source env.sh
cd /artifact/nnsmith
python3 experiments/lemon_tf2onnx.py --lemon_output_dir /path/to/lemon_outputs \
                                     --onnx_dir /artifact/data/lemon-onnx
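As a rough sanity check after the conversion, you could count the ONNX files that were produced. This sketch assumes the converted models are written with a `.onnx` extension somewhere under the `--onnx_dir` directory, which the text above does not confirm. The helper `count_onnx` is hypothetical.

```shell
# Print the number of *.onnx files found (recursively) under directory $1.
# Assumes converted models use the ".onnx" extension.
count_onnx() {
    local dir="$1"
    find "$dir" -type f -name '*.onnx' | wc -l | tr -d ' '
}

# Usage:
#   count_onnx /artifact/data/lemon-onnx
```

A count of zero would suggest that the conversion did not write any models to the expected location.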

Now you can go back to E3: LEMON3 Coverage to continue evaluating LEMON.

Other mini-experiments?#

In this artifact, we covered the main experiments (Evaluating artifact) and the extra experiments (Extra experiments) from the paper. Admittedly, there are still a few more experiments, such as Figure 8 and Figure 9. They are not included in the artifact because of their minor importance and their cost in time and computing resources (e.g., Figure 9 requires re-running the coverage experiments in another setting, which could take another day). Nevertheless, feel free to contact the artifact author over HotCRP if you are interested in these experiments, and the artifact author will set them up on demand.